OM protein - protein search, using sw model

Run on:         November 13, 2003, 09:39:50 ; Search time 10.6875 Seconds
                (without alignments)
                35.630 Million cell updates/sec

Title:          US-09-228-866-1
Perfect score:  54
Sequence:       1 CNSRLHLRC 9

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       328717 seqs, 42310858 residues

Total number of hits satisfying chosen parameters:      328717

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Issued_Patents_AA: * 

1: /cgn2_6/ptodata/1/iaa/5A_COMB.pep:*

2: /cgn2_6/ptodata/1/iaa/5B_COMB.pep:*

3: /cgn2_6/ptodata/1/iaa/6A_COMB.pep:*

4: /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*

5: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*

6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
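
The perfect score of 54 reported for this query is consistent with aligning the query against itself under BLOSUM62. The following is a minimal sketch, assuming the standard BLOSUM62 diagonal (self-match) values for the six distinct residues in the query; the names `BLOSUM62_DIAGONAL` and `perfect_score` are illustrative and are not part of GenCore.

```python
# Standard BLOSUM62 self-match scores for the residues that occur in the query.
# (Only the entries needed here are listed; this is not the full matrix.)
BLOSUM62_DIAGONAL = {"C": 9, "N": 6, "S": 4, "R": 5, "L": 4, "H": 8}


def perfect_score(sequence):
    """Sum the self-match score of every residue in the sequence."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in sequence)


if __name__ == "__main__":
    print(perfect_score("CNSRLHLRC"))
```

An ungapped identical alignment incurs no gap penalties, so the Gapop and Gapext settings do not enter this sum.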
ALIGNMENTS



RESULT 1 
US-08-526-710-1 

; Sequence 1, Application US/08526710 
; Patent No. 5622699 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell and Flores 

STREET: 4370 La Jolla Village Drive, Suite 700 
; CITY: San Diego 

; STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 



MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/526,710

FILING DATE: 11-SEP-1995

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 1779 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 1: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 9 amino acids 

; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-526-710-1 



Query Match 100.0%; Score 54; DB 1; Length 9;
Best Local Similarity 100.0%; Pred. No. 2.5e+05;
Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 2 
US-08-862-855-1 

; Sequence 1, Application US/08862855 
; Patent No. 6068829 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 
APPLICANT: Pasqualini, Renata

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 
STREET: 4370 La Jolla Village Drive, Suite 700 
CITY: San Diego 
; STATE: California 

COUNTRY: United States 
ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/862,855
FILING DATE: 



CLASSIFICATION: 424 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 

FILING DATE: 10-MAR-1997 
ATTORNEY/AGENT INFORMATION:

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 2621 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-862-855-1 



Query Match 100.0%; Score 54; DB 3; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9
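
Each hit carries three statistics lines made of `Name value;` pairs. The following is a rough sketch of how those lines could be read into a dictionary, assuming the layout shown in this report; `STAT_RE` and `parse_stats` are hypothetical helper names, not a GenCore interface.

```python
import re

# A field is a label (letters, dots, spaces), whitespace, a number,
# an optional percent sign, and a terminating semicolon.
STAT_RE = re.compile(r"([A-Za-z][A-Za-z. ]*?)\s+([0-9][0-9.e+\-]*)%?;")


def parse_stats(lines):
    """Collect every 'Label value;' pair from the given lines as floats."""
    stats = {}
    for line in lines:
        for label, value in STAT_RE.findall(line):
            stats[label.strip()] = float(value)
    return stats


if __name__ == "__main__":
    example = [
        "Query Match 100.0%; Score 54; DB 3; Length 9;",
        "Best Local Similarity 100.0%; Pred. No. 2.5e+05;",
        "Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;",
    ]
    for label, value in parse_stats(example).items():
        print(label, value)
```

The sketch expects each field to end in a semicolon, so lines damaged in extraction would need repair before parsing.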



RESULT 3 
US-09-226-985-1 

; Sequence 1, Application US/09226985 

; Patent No. 6296832 

; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Molecules That Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/226,985 

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 



APPLICATION NUMBER: US 08/526,710
FILING DATE: 11-SEP-1995

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 
FILING DATE: 10-MAR-1997 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 
FILING DATE: 23-MAY-1997 

ATTORNEY/AGENT INFORMATION:
NAME: Campbell, Cathryn A. 
REGISTRATION NUMBER: 31,815 
REFERENCE/DOCKET NUMBER: P-LJ 3423 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 1: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 9 amino acids 

; TYPE: amino acid 

TOPOLOGY: linear 

MOLECULE TYPE: peptide 
US-09-226-985-1 



Query Match 100.0%; Score 54; DB 3; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 4 
US-09-227-906-1 

; Sequence 1, Application US/09227906 

; Patent No. 6306365 

; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 
; STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/227,906

FILING DATE: 



CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 
FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 
FILING DATE: 10-MAR-1997 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 
FILING DATE: 23-MAY-1997 
ATTORNEY/AGENT INFORMATION: 
NAME: Campbell, Cathryn A. 
REGISTRATION NUMBER: 31,815 
REFERENCE/DOCKET NUMBER: P-LJ 3424 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 9 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-09-227-906-1 



Query Match 100.0%; Score 54; DB 4; Length 9;
Best Local Similarity 100.0%; Pred. No. 2.5e+05;
Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 5 
US-08-526-710-5 

; Sequence 5, Application US/08526710 
; Patent No. 5622699 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell and Flores 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/526,710

FILING DATE: 11-SEP-1995

CLASSIFICATION: 435
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 1779 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: linear
MOLECULE TYPE: peptide 
US-08-526-710-5 

Query Match 85.2%; Score 46; DB 1; Length 9; 

Best Local Similarity 88.9%; Pred. No. 2.5e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     ||||| |||
Db 1 CNSRLQLRC 9
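
The percentages printed for this hit can be reproduced from the raw counts. The sketch below assumes that Query Match is the hit score divided by the perfect score (54 for this query) and that Best Local Similarity is the number of identical positions divided by the aligned length; these definitions are inferred from the figures in this report rather than taken from GenCore documentation, and the function names are illustrative.

```python
def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the query's perfect (self) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, aligned_length):
    """Identical positions as a percentage of the aligned length."""
    return round(100.0 * matches / aligned_length, 1)


if __name__ == "__main__":
    # Values taken from the hit above: Score 46, Matches 8, nine aligned residues.
    print(query_match_percent(46, 54))
    print(best_local_similarity(8, 9))
```

Under the same assumptions, a score of 36 gives 66.7% and a score of 35 gives 64.8%, which agrees with the lower-scoring hits in this listing.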



RESULT 6 
US-08-862-855-5 

; Sequence 5, Application US/08862855 

; Patent No. 6068829 

; GENERAL INFORMATION: 

; APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.25

CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/862,855

FILING DATE: 

CLASSIFICATION: 424 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273

FILING DATE: 10-MAR-1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815

REFERENCE/DOCKET NUMBER: P-LJ 2621
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 
; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-862-855-5 

Query Match 85.2%; Score 46; DB 3; Length 9; 

Best Local Similarity 88.9%; Pred. No. 2.5e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     ||||| |||
Db 1 CNSRLQLRC 9



RESULT 7 
US-09-226-985-5 

; Sequence 5, Application US/09226985 
; Patent No. 6296832 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Molecules That Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 
; STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/226,985 

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 

FILING DATE: 10-MAR-1997 



PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 
FILING DATE: 23-MAY-1997
ATTORNEY/AGENT INFORMATION:
NAME: Campbell, Cathryn A.
REGISTRATION NUMBER: 31,815
REFERENCE/DOCKET NUMBER: P-LJ 3423
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 5: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 9 amino acids 

; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-09-226-985-5 

Query Match 85.2%; Score 46; DB 3; Length 9; 

Best Local Similarity 88.9%; Pred. No. 2.5e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     ||||| |||
Db 1 CNSRLQLRC 9



RESULT 8 
US-09-227-906-5 

; Sequence 5, Application US/09227906 

; Patent No. 6306365 

; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 
; APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 
; STATE: California 

COUNTRY: United States 

ZIP: 92122
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/227,906

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA: 



APPLICATION NUMBER: US 08/813,273 

FILING DATE: 10-MAR-1997 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 

FILING DATE: 23-MAY-1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 3424 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 
; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-09-227-906-5 

Query Match 85.2%; Score 46; DB 4; Length 9; 

Best Local Similarity 88.9%; Pred. No. 2.5e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     ||||| |||
Db 1 CNSRLQLRC 9



RESULT 9 
US-09-738-946-8 

; Sequence 8, Application US/09738946 

; Patent No. 6579701 

; GENERAL INFORMATION: 

; APPLICANT: EXELIXIS, INC. 

; TITLE OF INVENTION: DROSOPHILA HOMOLOGUES OF GENES AND PROTEINS IMPLICATED IN CANCER AND

; TITLE OF INVENTION: METHODS OF USE

; FILE REFERENCE: EX00-043C

; CURRENT APPLICATION NUMBER: US/09/738,946

; CURRENT FILING DATE: 2000-12-14 

; PRIOR APPLICATION NUMBER: 60/170,832 

; PRIOR FILING DATE: 1999-12-14 

; PRIOR APPLICATION NUMBER: 60/170,838 

; PRIOR FILING DATE: 1999-12-14 

; PRIOR APPLICATION NUMBER: 60/178,580 

; PRIOR FILING DATE: 2000-01-28 

; PRIOR APPLICATION NUMBER: 60/185,879 

; PRIOR FILING DATE: 2000-02-29 

; PRIOR APPLICATION NUMBER: 60/185,880 

; PRIOR FILING DATE: 2000-02-29 

; PRIOR APPLICATION NUMBER: 60/186,150 

; PRIOR FILING DATE: 2000-03-01 

; PRIOR APPLICATION NUMBER: 60/189,701 

; PRIOR FILING DATE: 2000-03-15 

; NUMBER OF SEQ ID NOS: 14

; SOFTWARE: PatentIn version 3.0

SEQ ID NO 8
LENGTH: 781
TYPE: PRT

ORGANISM: Drosophila melanogaster 
US-09-738-946-8 

Query Match 66.7%; Score 36; DB 4; Length 781;

Best Local Similarity 66.7%; Pred. No. 1.4e+02;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy   1 CNSRLHLRC 9
       |||| |  |
Db 529 CNSRGHCHC 537



RESULT 10 

US-09-252-991A-28422

; Sequence 28422, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136

; CURRENT APPLICATION NUMBER: US/09/252,991A

; CURRENT FILING DATE: 1999-02-18 

; PRIOR APPLICATION NUMBER: US 60/074,788 

; PRIOR FILING DATE: 1998-02-18 

; PRIOR APPLICATION NUMBER: US 60/094,190 

; PRIOR FILING DATE: 1998-07-27 

; NUMBER OF SEQ ID NOS: 33142

; SEQ ID NO 28422

LENGTH: 397

TYPE: PRT
; ORGANISM: Pseudomonas aeruginosa
US-09-252-991A-28422

Query Match 64.8%; Score 35; DB 4; Length 397;

Best Local Similarity 100.0%; Pred. No. 1.1e+02;

Matches 6; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   4 RLHLRC 9
       ||||||
Db 146 RLHLRC 151



RESULT 11 

US-09-252-991A-28223

; Sequence 28223, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136

; CURRENT APPLICATION NUMBER: US/09/252,991A

; CURRENT FILING DATE: 1999-02-18 

; PRIOR APPLICATION NUMBER: US 60/074,788 

; PRIOR FILING DATE: 1998-02-18 

; PRIOR APPLICATION NUMBER: US 60/094,190 

; PRIOR FILING DATE: 1998-07-27 

; NUMBER OF SEQ ID NOS: 33142

; SEQ ID NO 28223 

LENGTH: 511 

TYPE: PRT 
; ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-28223

Query Match 64.8%; Score 35; DB 4; Length 511; 

Best Local Similarity 66.7%; Pred. No. 1.4e+02; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy   1 CNSRLHLRC 9
       | || | ||
Db 130 CPSRTHRRC 138



RESULT 12 
US-09-330-740A-8 

; Sequence 8, Application US/09330740A 
; Patent No. 6291217 
; GENERAL INFORMATION: 

APPLICANT: Floh, Leopold

APPLICANT: Koenig, Kerstin
; APPLICANT: Menge, Ulrich

; TITLE OF INVENTION: Glutathionyl spermidine Synthetase and
TITLE OF INVENTION: Processes for Recovery and Use Thereof
NUMBER OF SEQUENCES: 9
CORRESPONDENCE ADDRESS:

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun

STREET: 233 South Wacker Drive, 6300 Sears Tower

CITY: Chicago 

STATE: Illinois 

COUNTRY: United States of America 

ZIP: 60606 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.30

CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/330,740A

FILING DATE: 11-JUN-1999

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: WO PCT/EP97/06982 

FILING DATE: 12-DEC-1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Zeller, James P. 

REGISTRATION NUMBER: 28,491

REFERENCE/DOCKET NUMBER: 29473/35677 
TELECOMMUNICATION INFORMATION: 



TELEPHONE: (312) 474-6300 
TELEFAX: (312) 474-0448 
INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS:
LENGTH: 573 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: protein
HYPOTHETICAL: NO
ANTI-SENSE: NO
FEATURE:

NAME/KEY: Modified-site
LOCATION: 191

OTHER INFORMATION: /note= "Xaa = Lys or Asn"
FEATURE:

NAME/KEY: Modified-site
LOCATION: 463

OTHER INFORMATION: /note= "Xaa = Val or Asp"
FEATURE:

NAME/KEY: Modified-site
LOCATION: 479

OTHER INFORMATION: /note= "Xaa = Val or Gly"
US-09-330-740A-8 

Query Match 64.8%; Score 35; DB 3; Length 573; 

Best Local Similarity 55.6%; Pred. No. 1.6e+02; 

Matches 5; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy   1 CNSRLHLRC 9
       |:   ||||
Db 236 CDHEFHLRC 244



RESULT 13 
US-09-004-838-11 

; Sequence 11, Application US/09004838 

; Patent No. 6350933

; GENERAL INFORMATION: 

APPLICANT: Michelmore, Richard W. 
APPLICANT: Shen, Kathy 
APPLICANT: Meyers, Blake 
; TITLE OF INVENTION: Procedures and Materials for 
; TITLE OF INVENTION: Conferring Pest Resistance in Plants 
NUMBER OF SEQUENCES: 140
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Townsend and Townsend and Crew LLP 

STREET: Two Embarcadero Center, Eighth Floor 
; CITY: San Francisco 

; STATE: California 

COUNTRY: USA
ZIP: 94111-3834 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.30

CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/004,838

FILING DATE: 09-JAN-1998 

CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/781,734 

FILING DATE: 10-JAN-1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Einhorn, Gregory P. 

REGISTRATION NUMBER: 38,440

REFERENCE/DOCKET NUMBER: 023070-078810US
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (415) 576-0200 

TELEFAX: (415) 576-0300 
; INFORMATION FOR SEQ ID NO: 11: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 1402 amino acids

TYPE: amino acid

STRANDEDNESS:

TOPOLOGY: linear
MOLECULE TYPE: protein
FEATURE:

NAME/KEY:

LOCATION: 1..1402

OTHER INFORMATION: /note= "RLG1A amino acids"
US-09-004-838-11 



Query Match 64.8%; Score 35; DB 4; Length 1402;
Best Local Similarity 66.7%; Pred. No. 3.8e+02;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |||  | ||
Db 1264 CNSLEHCRC 1272



RESULT 14 
US-09-082-358B-88 

; Sequence 88, Application US/09082358B 

; Patent No. 6469153 

; GENERAL INFORMATION: 

; APPLICANT: Goff, Stephen P. 

; APPLICANT: Li, Xingquiang 

; TITLE OF INVENTION: EIP-1, EIP-3 GENES, ENVELOPE-INTERACTING PROTEINS,
; TITLE OF INVENTION: EIP-1, and EIP-3
; FILE REFERENCE: 0575/54804

; CURRENT APPLICATION NUMBER: US/09/082,358B
; CURRENT FILING DATE: 1998-05-20
; NUMBER OF SEQ ID NOS: 106
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 88 

LENGTH: 36 

TYPE: PRT 
; ORGANISM: murine 
US-09-082-358B-88 



Query Match 63.0%; Score 34; DB 4; Length 36;
Best Local Similarity 85.7%; Pred. No. 15;
Matches 6; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 3 SRLHLRC 9
     || ||||
Db 3 SRTHLRC 9



RESULT 15 
US-09-227-357-580 

; Sequence 580, Application US/09227357 

; Patent No. 6342581 

; GENERAL INFORMATION: 

; APPLICANT: Fischer et al.

TITLE OF INVENTION: 123 Human Secreted Proteins 

; FILE REFERENCE: PZ010P1 

; CURRENT APPLICATION NUMBER: US/09/227,357

; CURRENT FILING DATE: 1999-01-08

; EARLIER APPLICATION NUMBER: PCT/US98/13684

; EARLIER FILING DATE: 1998-07-07 

; EARLIER APPLICATION NUMBER: 60/051,926 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/052,793 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,925 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,929 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/052,803 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/052,732 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,931 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,932 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,916 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,930 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,918 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,920 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/052,733 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/052,795 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,919 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/051,928 

; EARLIER FILING DATE: 1997-07-08 

; EARLIER APPLICATION NUMBER: 60/055,722 

; EARLIER FILING DATE: 1997-08-18 

; EARLIER APPLICATION NUMBER: 60/055,723 

; EARLIER FILING DATE: 1997-08-18 

; EARLIER APPLICATION NUMBER: 60/055,948 



EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,949 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,953 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,950 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,947 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,964 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/056,360 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,684 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,984 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/055,954 
EARLIER FILING DATE: 1997-08-18 
EARLIER APPLICATION NUMBER: 60/058,785 
EARLIER FILING DATE: 1997-09-12 
EARLIER APPLICATION NUMBER: 60/058,664
EARLIER FILING DATE: 1997-09-12 
EARLIER APPLICATION NUMBER: 60/058,660 
EARLIER FILING DATE: 1997-09-12 
EARLIER APPLICATION NUMBER: 60/058,661 
EARLIER FILING DATE: 1997-09-12 
NUMBER OF SEQ ID NOS: 672
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 580
LENGTH: 40
TYPE: PRT

ORGANISM: Homo sapiens 
US-09-227-357-580 

Query Match 63.0%; Score 34; DB 4; Length 40;

Best Local Similarity 44.4%; Pred. No. 17;

Matches 4; Conservative 4; Mismatches 1; Indels 0; Gaps 0;

Qy 1 CNSRLHLRC 9
     | :|:|::|
Db 2 CVTRMHVKC 10



Search completed: November 13, 2003, 09:54:55
Job time : 11.6875 secs

GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.



OM protein - protein search, using sw model 

Run on: November 13, 2003, 09:46:46 ; Search time 152.438 Seconds 

(without alignments) 
53.722 Million cell updates/sec 



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: November 13, 2003, 09:31:40 ; Search time 30.2812 Seconds 

(without alignments) 
47.176 Million cell updates/sec 

Title: US-09-228-866-1

Perfect score: 54 

Sequence: 1 CNSRLHLRC 9 



Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5 



Searched: 1107863 seqs, 158726573 residues 

Total number of hits satisfying chosen parameters: 1107863 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : A_Geneseq_19Jun03:*

1: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1980.DAT:*
2: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1981.DAT:*
3: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1982.DAT:*
4: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1983.DAT:*
5: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1984.DAT:*
6: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1985.DAT:*
7: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1986.DAT:*
8: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1987.DAT:*
9: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1988.DAT:*
10: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1989.DAT:*
11: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1990.DAT:*
12: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1991.DAT:*
13: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1992.DAT:*
14: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1993.DAT:*
15: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1994.DAT:*
16: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1995.DAT:*
17: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1996.DAT:*
18: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1997.DAT:*
19: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1998.DAT:*
20: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1999.DAT:*
21: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2000.DAT:*
22: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2001.DAT:*
23: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2002.DAT:*
24: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2003.DAT:*

Pred. No. is the number of results predicted by chance to have a 

score greater than or equal to the score of the result being printed, 



and is derived by analysis of the total score distribution. 
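
The "Million cell updates/sec" figure in the header can be checked against the other header values. The sketch below assumes that one cell update corresponds to one query residue compared with one database residue, so that the total is the query length times the number of residues searched; that definition is an inference which happens to reproduce the reported figures, and the function name is illustrative.

```python
def million_cell_updates_per_sec(query_length, db_residues, search_seconds):
    """Dynamic-programming cells filled per second, in millions."""
    total_cells = query_length * db_residues
    return total_cells / search_seconds / 1e6


if __name__ == "__main__":
    # Geneseq search: 9-residue query, 158726573 residues, 30.2812 seconds.
    print(round(million_cell_updates_per_sec(9, 158726573, 30.2812), 3))
    # Issued_Patents_AA search: 42310858 residues, 10.6875 seconds.
    print(round(million_cell_updates_per_sec(9, 42310858, 10.6875), 3))
```

Both values come out within rounding of the 47.176 and 35.630 printed in the two report headers.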



SUMMARIES

Result             Query
   No.   Score   Match %   Length   DB   ID         Description
-----------------------------------------------------------------------
                                         AAY60189   Human (remainder illegible)
    25      36      66.7       93   22   ABG26573   Novel human diagno
                                         ABB65114
                                         AAU19470   Human (remainder illegible)
                                    23   ABP68968   Human polypeptide
            36      66.7            22   ABB63422   Drosophila melanog
            35      64.8       51   23   ABP32783   Human ORF1756 prot
                                    22   AAB95889   Human protein sequ
                                    21   AAG36307
                                         AAG36578
            35      64.8            22   ABG26084   Novel human diagno
            35      64.8            18   AAW55451   H. pylori ORF 02ae
            35      64.8            19   AAW98321   H. pylori GHPO 134
            35      64.8            19   AAW71512   Helicobacter polyp
    44      35      64.8            21   AAG13847   Arabidopsis thalia
    45      35      64.8            21   AAG53803   Arabidopsis thalia


ALIGNMENTS 



RESULT 1 
AAW13410 

ID AAW13410 standard; Peptide; 9 AA. 
XX 

AC AAW13410; 
XX 

DT 15-JAN-1998 (first entry) 
XX 

DE Brain homing peptide. 
XX 

KW Brain homing peptide; in vivo panning; screening; phage display. 
XX 

OS Synthetic. 
XX 

PN WO9710507-A1.
XX

PD 20-MAR-1997.
XX

PF 10-SEP-1996; 96WO-US14600.
XX

PR 11-SEP-1995; 95US-0526710.

PR 11-SEP-1995; 95US-0526708.
XX

PA (LJOL-) LA JOLLA CANCER RES FOUND.
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 1997-202359/18. 
XX 

PT Obtaining compound that homes to selected organ or tissue - by in 

PT vivo panning method, specifically to identify brain, kidney, 

PT angiogenic vasculature or tumour tissue homing peptide(s)
XX 

PS Claim 11; Page 67; 75pp; English. 
XX 

CC This synthetic peptide is a claimed example of a brain-homing 

CC peptide that was identified using a novel method for obtaining 

CC molecules that home to a selected organ or tissue. This in vivo 

CC panning method typically involves administering a phage display 

CC library to a subject, and identifying expressed peptides which 

CC home to the desired organ or tissue, e.g. brain, kidney, angiogenic 

CC vascular tissue or tumour tissue. The isolated peptides (see 

CC AAW13412-52, AAW11181-86) can be used to target e.g. drugs, toxins or 

CC labels to the selected organ/tissue (claimed) or to identify and/or 

CC isolate target molecules (claimed). The peptides can be directly

CC identified in vivo, as compared to prior art in vitro screening 

CC methods, which require further examination to see if they maintain 

CC specificity in vivo. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 54; DB 18; Length 9;

Best Local Similarity 100.0%; Pred. No. 9.3e+05;

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 2 
AAB07387 

ID AAB07387 standard; peptide; 9 AA. 
XX 

AC AAB07387; 
XX 

DT 17-OCT-2000 (first entry) 
XX 

DE Brain homing peptide # 1. 
XX 

KW Brain; homing peptide; organ targeting; tissue targeting; mouse; cyclic. 
XX 

OS Mus sp. 
XX 

FH Key Location/Qualifiers 
FT Disulfide-bond 1..9

FT /note= "Can optionally form a cyclic peptide" 

XX 

PN US6068829-A. 
XX 

PD 30-MAY-2000. 
XX 

PF 23-JUN-1997; 97US-0862855.
XX

PR 11-SEP-1995; 95US-0526710.
PR 10-MAR-1997; 97US-0813273.
XX 

PA (BURN-) BURNHAM INST. 
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 2000-410850/35. 
XX 

PT Identifying and recovering organ homing molecules or peptides by in 
PT vivo panning comprises administering a library of diverse peptides 
PT linked to a tag which facilitates recovery of these peptides 
XX 

PS Example 2; Column 17; 20pp; English.
XX 

CC The present sequence is a mouse brain homing peptide. This sequence was 

CC identified by using in vivo panning to screen a library of potential 

CC organ homing molecules. The present sequence can be used to direct a 

CC moiety to the brain tissue, by linking the moiety to the present

CC sequence. Examples of potential moieties are drugs, toxins or a 

CC detectable label. The present sequence contains a SRL amino acid motif. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 54; DB 21; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 3 
AAE11793 

ID AAE11793 standard; peptide; 9 AA. 
XX 

AC AAE11793;
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Phage peptide #1 targetted to brain. 
XX 

KW Enriched library fraction; brain; kidney; tumour; panning; diagnostic; 

KW molecular medicine; drug delivery; peptidomimetic; pharmaceutical.
XX 

OS Bacteriophage. 
XX 

FH Key Location/Qualifiers 

FT Domain 3..5

FT /label= SRL_motif

XX 

PN US6296832-B1. 
XX 

PD 02-OCT-2001.
XX

PF 08-JAN-1999; 99US-0226985.
XX

PR 23-JUN-1997; 97US-0862855.

PR 11-SEP-1995; 95US-0526710.

PR 10-MAR-1997; 97US-0813273.
XX 

PA (BURN-) BURNHAM INST. 
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2001-610691/70. 
XX 

PT Enriched library fraction comprising molecules recovered by in vivo 

PT panning that selectively home to a selected organ or tissue useful for 

PT treating disease or in diagnostic methods 
XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The invention relates to an enriched library fraction containing 

CC molecules that selectively home to a selected organ or tissue such as 

CC brain, kidney or tumour recovered by in vivo panning. The invention 

CC generally relates to the field of molecular medicine, drug delivery and 

CC to a method of in vivo panning for identifying a molecule that homes to a

CC specific organ. The molecules, e.g., peptides, peptidomimetics, proteins

CC and fragments of proteins contained in an enriched library fraction may 

CC be administered to a subject as part of a pharmaceutical composition to 

CC treat disease or in diagnostic methods. The present sequence is a 

CC peptide from bacteriophage targetted to brain. 

XX 

SQ Sequence 9 AA; 



Query Match 100.0%; Score 54; DB 22; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9

RESULT 4 
AAU10704 

ID AAU10704 standard; peptide; 9 AA. 
XX 

AC AAU10704; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Brain homing peptide #1 useful for delivery of target molecules. 
XX 

KW Organ targeting; tissue targeting; cancer; tumour homing molecule; 

KW delivery of target molecule; brain homing peptide. 

XX 

OS Synthetic.
XX 

PN US6306365-B1. 
XX 

PD 23-OCT-2001. 
XX 

PF 08-JAN-1999; 99US-0227906.
XX

PR 23-JUN-1997; 97US-0862855.

PR 11-SEP-1995; 95US-0526710.

PR 10-MAR-1997; 97US-0813273.
XX

PA (BURN-) BURNHAM INST.
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2002-040196/05. 
XX 

PT Recovering molecules that home to an organ or tissue, useful for 

PT identifying molecules that home to a specific organ or tissue, e.g. 

PT identifying a tumour homing molecule to identify the presence of cancer, 

PT by in vivo panning of a library - 

XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The present invention relates to a method of recovering molecules that 

CC home to a selected organ or tissue. The method comprises administering 

CC to the subject the library of diverse molecules, collecting a sample of 

CC the selected organ or tissue (e.g. brain or kidney), and recovering from

CC the sample several molecules that home to the selected organ or tissue.

CC The method is useful for identifying molecules, particularly useful for

CC screening a large number of molecules (e.g. peptides), that home to a

CC specific organ. The identified molecule is useful for e.g. raising an 

CC antibody specific for a target molecule, targeting a desired moiety 

CC (e.g. drug, toxin or detectable label) to the selected organ. 



CC Specifically, the method is useful for identifying the presence of cancer 

CC in a subject by linking an appropriate moiety to a tumour homing 

CC molecule. The present method provides a direct means for identifying 

CC molecules that specifically home to a selected organ and, therefore 

CC provides a significant advantage over previous methods, which require 

CC that a molecule identified using an in vitro screening method 

CC subsequently be examined to determine if it maintains its specificity in 

CC vivo. AAU10704-AAU10723 represent brain homing peptides described in 

CC the present invention. 

XX 

SQ Sequence 9 AA; 



Query Match 100.0%; Score 54; DB 23; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 5
ABU59529

ID ABU59529 standard; Peptide; 9 AA.
XX

AC ABU59529;
XX

DT ??-APR-2003 (first entry)
XX

DE Brain receptor targeting peptide #1.
XX

KW Targeting ligand; bioactive agent; polymer matrix; cancer; cytostatic;
KW cathepsin-D substrate; peptides; neuroreceptor; adrenal receptor;
KW fibronectin; vitronectin; integrin; RGD motif; angiogenic endothelium;
KW tumour; cationic cancer-targeting peptide.
XX

OS Synthetic.
XX

PN US2002041898-A1.
XX

PD 11-APR-2002.
XX

PF 25-JUL-2001; 2001US-0912609.
XX

PR 05-JAN-2000; 2000US-0478124.
PR 31-OCT-2000; 2000US-0703474.
XX

PA (UNGE/) UNGER E C.
PA (MATS/) MATSUNAGA T O.
PA (RAMA/) RAMASWAMI V.
PA (ROMA/) ROMANOWSKI M J.
XX

PI Unger EC, Matsunaga TO, Ramaswami V, Romanowski MJ;
XX

DR WPI; 2003-208921/20.
XX

PT Targeted delivery system comprising a bioactive agent homogeneously
PT dispersed in a targeted matrix is especially useful in cancer therapy

PT 

XX 

PS Claim 23; Page 37; 46pp; English. 
XX 

CC The invention relates to a composition comprising a bioactive agent 

CC homogeneously dispersed in a targeted matrix (polymer and targeting 

CC ligand). Also included are a targeted matrix for use as a delivery

CC vehicle comprising a polymer associated with a targeting ligand, 

CC enhancing the bioavailability of an agent comprising administration 

CC of the composition and treating cancer comprising administration of the 

CC novel composition. The method is useful for targeted delivery of a drug, 

CC especially in cancer therapy. The targeting ligand may be a peptide. 

CC Examples of targeting peptides are disclosed including cathepsin-D 

CC substrate peptides, peptides targeting receptors in the brain and 

CC kidney, peptides recognising fibronectin- and vitronectin-binding 

CC integrins, peptides targeting the RGD (Arg-Gly-Asp)-motif in, e.g.,

CC antibodies, peptides targeting the angiogenic endothelium of solid 

CC tumours, tissue specific peptides (e.g. of lung, skin, pancreas, 

CC intestine, uterus, adrenal gland and retina), and cationic cancer- 

CC targeting peptides. The present sequence is a peptide targeting 

CC ligand disclosed in the invention. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 54; DB 24; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 6 
AAW13411 

ID AAW13411 standard; Peptide; 9 AA. 
XX 

AC AAW13411; 
XX 

DT 15-JAN-1998 (first entry) 
XX 

DE Brain homing peptide. 
XX 

KW Brain homing peptide; in vivo panning; screening; phage display; 

KW drug delivery. 

XX 

OS Synthetic. 
XX 

PN WO9710507-A1.
XX

PD 20-MAR-1997.
XX

PF 10-SEP-1996; 96WO-US14600.
XX

PR 11-SEP-1995; 95US-0526710.

PR 11-SEP-1995; 95US-0526708.



XX 

PA (LJOL-) LA JOLLA CANCER RES FOUND. 
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 1997-202359/18. 
XX 

PT Obtaining compound that homes to selected organ or tissue - by in 

PT vivo panning method, specifically to identify brain, kidney, 

PT angiogenic vasculature or tumour tissue homing peptide(s)
XX 

PS Claim 11; Page 67; 75pp; English. 
XX 

CC This synthetic peptide is a claimed example of a brain-homing 

CC peptide that was identified using a novel method for obtaining 

CC molecules that home to a selected organ or tissue. This in vivo 

CC panning method typically involves administering a phage display 

CC library to a subject, and identifying expressed peptides which 

CC home to the desired organ or tissue, e.g. brain, kidney, angiogenic 

CC vascular tissue or tumour tissue. The isolated peptides (see 

CC AAW13412-52, AAW11181-86) can be used to target e.g. drugs, toxins or 

CC labels to the selected organ/tissue (claimed) or to identify and/or 

CC isolate target molecules (claimed). The peptides can be directly

CC identified in vivo, as compared to prior art in vitro screening 

CC methods, which require further examination to see if they maintain 

CC specificity in vivo. 

XX 

SQ Sequence 9 AA; 

Query Match 85.2%; Score 46; DB 18; Length 9; 

Best Local Similarity 88.9%; Pred. No. 9.3e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     ||||| |||
Db 1 CNSRLQLRC 9

RESULT 7 
AAB07391 

ID AAB07391 standard; peptide; 9 AA. 
XX 

AC AAB07391; 
XX 

DT 17-OCT-2000 (first entry) 
XX 

DE Brain homing peptide # 5. 
XX 

KW Brain; homing peptide; organ targeting; tissue targeting; mouse; cyclic. 
XX 

OS Mus sp. 
XX 

FH Key Location/Qualifiers 

FT Disulfide-bond 1..9

FT /note= "Can optionally form a cyclic peptide" 

XX 

PN US6068829-A. 



XX 

PD 30-MAY-2000.
XX

PF 23-JUN-1997; 97US-0862855.
XX

PR 11-SEP-1995; 95US-0526710.

PR 10-MAR-1997; 97US-0813273.
XX

PA (BURN-) BURNHAM INST.
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 2000-410850/35. 
XX 

PT Identifying and recovering organ homing molecules or peptides by in 

PT vivo panning comprises administering a library of diverse peptides 

PT linked to a tag which facilitates recovery of these peptides 
XX 

PS Example 2; Column 17; 20pp; English. 
XX 

CC The present sequence is a mouse brain homing peptide. This sequence was 

CC identified by using in vivo panning to screen a library of potential 

CC organ homing molecules. The present sequence can be used to direct a 

CC moiety to the brain tissue, by linking the moiety to the present

CC sequence. Examples of potential moieties are drugs, toxins or a 

CC detectable label. The present sequence contains a SRL amino acid motif. 

XX 

SQ Sequence 9 AA; 

Query Match 85.2%; Score 46; DB 21; Length 9; 

Best Local Similarity 88.9%; Pred. No. 9.3e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
Qy 1 CNSRLHLRC 9 

Db 1 CNSRLQLRC 9 



RESULT 8 
AAE11797 

ID AAE11797 standard; peptide; 9 AA. 
XX 

AC AAE11797; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Phage peptide #5 targetted to brain. 
XX 

KW Enriched library fraction; brain; kidney; tumour; panning; diagnostic;
KW molecular medicine; drug delivery; peptidomimetic; pharmaceutical. 
XX 

OS Bacteriophage. 
XX 

FH Key Location/Qualifiers 

FT Domain 3..5

FT /label= SRL_motif

XX 



PN US6296832-B1.
XX

PD 02-OCT-2001.
XX

PF 08-JAN-1999; 99US-0226985.
XX

PR 23-JUN-1997; 97US-0862855.

PR 11-SEP-1995; 95US-0526710.

PR 10-MAR-1997; 97US-0813273.
XX

PA (BURN-) BURNHAM INST.
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2001-610691/70. 
XX 

PT Enriched library fraction comprising molecules recovered by in vivo 

PT panning that selectively home to a selected organ or tissue useful for 

PT treating disease or in diagnostic methods 
XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The invention relates to an enriched library fraction containing 

CC molecules that selectively home to a selected organ or tissue such as 

CC brain, kidney or tumour recovered by in vivo panning. The invention 

CC generally relates to the field of molecular medicine, drug delivery and 

CC to a method of in vivo panning for identifying a molecule that homes to a

CC specific organ. The molecules, e.g., peptides, peptidomimetics, proteins

CC and fragments of proteins contained in an enriched library fraction may 

CC be administered to a subject as part of a pharmaceutical composition to 

CC treat disease or in diagnostic methods. The present sequence is a 

CC peptide from bacteriophage targetted to brain. 

XX 

SQ Sequence 9 AA; 

Query Match 85.2%; Score 46; DB 22; Length 9; 
Best Local Similarity 88.9%; Pred. No. 9.3e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     ||||| |||
Db 1 CNSRLQLRC 9



RESULT 9 
AAU10708 

ID AAU10708 standard; peptide; 9 AA. 
XX 

AC AAU10708; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Brain homing peptide #5 useful for delivery of target molecules. 
XX 

KW Organ targeting; tissue targeting; cancer; tumour homing molecule; 

KW delivery of target molecule; brain homing peptide. 

XX 



OS Synthetic. 
XX 

PN US6306365-B1. 
XX 

PD 23-OCT-2001. 
XX 

PF 08-JAN-1999; 99US-0227906.
XX

PR 23-JUN-1997; 97US-0862855.

PR 11-SEP-1995; 95US-0526710.

PR 10-MAR-1997; 97US-0813273.
XX 

PA (BURN-) BURNHAM INST. 
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2002-040196/05. 
XX 

PT Recovering molecules that home to an organ or tissue, useful for 

PT identifying molecules that home to a specific organ or tissue, e.g. 

PT identifying a tumour homing molecule to identify the presence of cancer, 

PT by in vivo panning of a library - 

XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The present invention relates to a method of recovering molecules that 

CC home to a selected organ or tissue. The method comprises administering 

CC to the subject the library of diverse molecules, collecting a sample of 

CC the selected organ or tissue (e.g. brain or kidney), and recovering from

CC the sample several molecules that home to the selected organ or tissue.

CC The method is useful for identifying molecules, particularly useful for

CC screening a large number of molecules (e.g. peptides), that home to a

CC specific organ. The identified molecule is useful for e.g. raising an 

CC antibody specific for a target molecule, targeting a desired moiety 

CC (e.g. drug, toxin or detectable label) to the selected organ. 

CC Specifically, the method is useful for identifying the presence of cancer 

CC in a subject by linking an appropriate moiety to a tumour homing 

CC molecule. The present method provides a direct means for identifying 

CC molecules that specifically home to a selected organ and, therefore 

CC provides a significant advantage over previous methods, which require 

CC that a molecule identified using an in vitro screening method 

CC subsequently be examined to determine if it maintains its specificity in 

CC vivo. AAU10704-AAU10723 represent brain homing peptides described in 

CC the present invention. 

XX 

SQ Sequence 9 AA; 

Query Match 85.2%; Score 46; DB 23; Length 9; 

Best Local Similarity 88.9%; Pred. No. 9.3e+05; 

Matches 8; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CNSRLHLRC 9
     ||||| |||
Db 1 CNSRLQLRC 9



RESULT 10 



ABG22110 

ID ABG22110 standard; Protein; 341 AA. 
XX 

AC ABG22110; 
XX 

DT 18-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #22101. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2.
XX

PD 11-OCT-2001.
XX

PF 30-MAR-2001; 2001WO-US08631.
XX

PR 31-MAR-2000; 2000US-0540217.

PR 23-AUG-2000; 2000US-0649167.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS86297. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity 
XX 

PS Claim 20; SEQ ID No 52469; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and 

CC polypeptide (II) sequences. (I) is useful as hybridisation probes, 

CC polymerase chain reaction (PCR) primers, oligomers, and for chromosome 

CC and gene mapping, and in recombinant production of (II). The

CC polynucleotides are also used in diagnostics as expressed sequence tags

CC for identifying expressed genes. (I) is useful in gene therapy techniques

CC to restore normal activity of (II) or to treat disease states involving

CC (II). (II) is useful for generating antibodies against it, detecting or

CC quantitating a polypeptide in tissue, as molecular weight markers and as 

CC a food supplement. (II) and its binding partners are useful in medical 

CC imaging of sites expressing (II). (I) and (II) are useful for treating 

CC disorders involving aberrant protein expression or biological activity. 

CC The polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human

CC diagnostic amino acid sequences of the invention. 

CC Note: The sequence data for this patent did not appear in the printed 

CC specification, but was obtained in electronic format directly from WIPO 



CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 341 AA; 

Query Match 74.1%; Score 40; DB 22; Length 341; 
Best Local Similarity 66.7%; Pred. No. 65; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy   1 CNSRLHLRC 9
       |:  |||||
Db 150 CSGHLHLRC 158

RESULT 11 
AAY70474 

ID AAY70474 standard; Protein; 1327 AA. 
XX 

AC AAY70474; 
XX 

DT 04-JUL-2000 (first entry) 
XX 

DE Human cyclic nucleotide-associated protein-2 (CNAP-2).
XX

KW Cyclic nucleotide-associated protein-2; CNAP-2; human; cytostatic;
KW anti-arteriosclerotic; hepatotropic; anti-leukaemic; anti-inflammatory;
KW immunomodulatory; anti-asthmatic; anti-anaemic; anti-diabetic; diagnosis;
KW anti-sclerotic; dermatological; neuroprotective; anti-epileptic; cancer;
KW anti-Alzheimer's; anti-Parkinsonian; cerebroprotective; ophthalmological;
KW anti-infertility; anti-allergic; vasotropic; immunosuppressive;
KW hypotensive; gene therapy; prevention; treatment; arteriosclerosis;
KW cell proliferative disorder; autoimmune/inflammatory; diabetes mellitus;
KW neurological; vision; reproductive; smooth muscle.

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Peptide 1..34
FT /label= Signal_peptide
FT Protein 35..1327
FT /label= Mature_CNAP-2
FT /note= "Shares 24% identity to Aquifex pyrophilus
FT esterase 28LC"
FT Modified-site 68
FT /note= "Potential phosphorylation site"
FT Modified-site 1225
FT /note= "Potential phosphorylation site"
FT Modified-site 73
FT /note= "Potential phosphorylation site"
FT Modified-site 125
FT /note= "Potential phosphorylation site"
FT Modified-site 220
FT /note= "Potential phosphorylation site"
FT Modified-site 326
FT /note= "Potential phosphorylation site"
FT Modified-site 357
FT /note= "Potential phosphorylation site"
FT Modified-site 386
FT /note= "Potential phosphorylation site"
FT Modified-site 400
FT /note= "Potential phosphorylation site"
FT Modified-site 432
FT /note= "Potential phosphorylation site"
FT Modified-site 455
FT /note= "Potential phosphorylation site"
FT Modified-site 560
FT /note= "Potential phosphorylation site"
FT Modified-site 600
FT /note= "Potential phosphorylation site"
FT Modified-site 780
FT /note= "Potential phosphorylation site"
FT Modified-site 784
FT /note= "Potential phosphorylation site"
FT Modified-site 997
FT /note= "Potential phosphorylation site"
FT Modified-site 1113
FT /note= "Potential phosphorylation site"
FT Modified-site 1121
FT /note= "Potential phosphorylation site"
FT Modified-site 1171
FT /note= "Potential phosphorylation site"
FT Modified-site 1251
FT /note= "Potential phosphorylation site"
FT Modified-site 1274
FT /note= "Potential phosphorylation site"
FT Modified-site 1285
FT /note= "Potential phosphorylation site"
FT Modified-site 1299
FT /note= "Potential phosphorylation site"
FT Modified-site 1301
FT /note= "Potential phosphorylation site"
FT Modified-site 1323
FT /note= "Potential phosphorylation site"
FT Modified-site 82
FT /note= "Potential phosphorylation site"
FT Modified-site 236
FT /note= "Potential phosphorylation site"
FT Modified-site 319
FT /note= "Potential phosphorylation site"
FT Modified-site 547
FT /note= "Potential phosphorylation site"
FT Modified-site 634
FT /note= "Potential phosphorylation site"
FT Modified-site 699
FT /note= "Potential phosphorylation site"
FT Modified-site 816
FT /note= "Potential phosphorylation site"
FT Modified-site 894
FT /note= "Potential phosphorylation site"
FT Modified-site 910
FT /note= "Potential phosphorylation site"
FT Modified-site 1220
FT /note= "Potential phosphorylation site"
FT Modified-site 1230
FT /note= "Potential phosphorylation site"
FT Modified-site 392
FT /note= "Potential phosphorylation site"
FT Modified-site 1019
FT /note= "N-glycosylated"
FT Modified-site 1040
FT /note= "N-glycosylated"
FT Modified-site 1228
FT /note= "N-glycosylated"
FT Binding-site 144..269
FT /label= cNMP-binding_domain
FT Binding-site 573..696
FT /label= cNMP-binding_domain
FT Domain 10..30
FT /label= Transmembrane_domain
FT Region 605..628
FT /note= "Resembles cyclic-nucleotide binding domain
FT proteins"
FT Region 643..676
FT /note= "Resembles cyclic-nucleotide binding domain
FT proteins"

XX 

PN WO200014248-A1. 
XX 

PD 16-MAR-2000. 
XX 

PF 03-SEP-1999; 99WO-US20287.
XX

PR 04-SEP-1998; 98US-0148904.
XX

PA (INCY-) INCYTE PHARM INC.
XX

PI Hillman JL, Yue H, Guegler KJ, Corley NC, Patterson C, Tang YT;
XX

DR WPI; 2000-256994/22.

DR N-PSDB; AAZ51683.
XX 

PT Isolated cyclic nucleotide associated proteins useful for preventing, 

PT diagnosing and treating cell proliferative, autoimmune/inflammatory,

PT neurological, vision, reproductive and smooth muscle disorders - 
XX 

PS Disclosure; Page 64-67; 78pp; English. 
XX 

CC The present sequence is a human cyclic nucleotide- 

CC associated protein-2 (CNAP-2), identified in Incyte clone 3149674,

CC that is isolated from ADRENON04 cDNA library. It is expressed in 

CC nervous, reproductive, cardiovascular and haematopoietic/immune tissues. 

CC CNAP sequences may be used for prevention, treatment and diagnosis of 

CC diseases associated with altered CNAP expression such as, cell 

CC proliferative disorders (e.g. arteriosclerosis, cirrhosis, leukaemia, 

CC lymphoma and cancer of the breast, prostate, lung and brain), autoimmune/

CC inflammatory disorders (e.g. asthma, anaemia, diabetes mellitus, multiple

CC sclerosis and psoriasis), neurological disorders (e.g. epilepsy,

CC Alzheimer's/Parkinson's disease and strokes), vision disorders (e.g.

CC conjunctivitis, glaucoma, cataracts and retinitis pigmentosa),

CC reproductive disorders (e.g. infertility, uterine fibroids, ectopic

CC pregnancies and impotence) and smooth muscle disorders (e.g. angina,

CC anaphylactic shock, Kearns-Sayre syndrome and hypertension). The



CC coding sequence can be used for gene therapy. 
XX 

SQ Sequence 1327 AA; 

Query Match 74.1%; Score 40; DB 21; Length 1327; 

Best Local Similarity 66.7%; Pred. No. 2.2e+02; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CNSRLHLRC 9
       |:  |||||
Db 880 CSGHLHLRC 888



RESULT 12 
ABB58383 

ID ABB58383 standard; Protein; 1091 AA. 
XX 

AC ABB58383; 
XX 

DT 26-MAR-2002 (first entry) 
XX 

DE Drosophila melanogaster polypeptide SEQ ID NO 1941. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 

KW pharmaceutical. 

XX 

OS Drosophila melanogaster. 
XX 

PN WO200171042-A2. 
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US09231.
XX

PR 23-MAR-2000; 2000US-191637P.
PR 11-JUL-2000; 2000US-0614150.
XX

PA (PEKE-) PE CORP NY.
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 
DR N-PSDB; ABL02486. 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more 
PT genes from Drosophila and for elucidating cell signalling and cell-cell 
PT interactions - 
XX 

PS Disclosure; SEQ ID NO 1941; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 
CC capable of detecting 1000 or more genes from Drosophila. The invention
CC is useful in developmental biology and in elucidating cell signalling and
CC cell-cell interactions in higher eukaryotes for the development of
CC insecticides, therapeutics and pharmaceutical drugs. The invention
CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA
CC sequences (ABL01840-ABL16175) and the encoded proteins
CC (ABB57737-ABB72072).
CC The sequence data for this patent did not form part of the printed
CC specification, but was obtained in electronic format directly from WIPO
CC at ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 1091 AA; 

Query Match 72.2%; Score 39; DB 22; Length 1091; 

Best Local Similarity 87.5%; Pred. No. 2.7e+02; 

Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 NSRLHLRC 9
       | ||||||
Db 307 NFRLHLRC 314



RESULT 13 
AAG16491

ID AAG16491 standard; Protein; 167 AA. 
XX 
AC AAG16491;
XX
DT 17-OCT-2000 (first entry)
XX
DE Arabidopsis thaliana protein fragment SEQ ID NO: 17157.
XX
KW Protein identification; signal transduction pathway; metabolic pathway;
KW hybridisation assay; genetic mapping; gene expression control; promoter;
KW termination sequence.
XX
OS Arabidopsis thaliana.
XX
PN EP1033405-A2.
XX
PD 06-SEP-2000.
XX
PF 25-FEB-2000; 2000EP-0301439.
XX
PR 25-FEB-1999
PR 05-MAR-1999
PR 09-MAR-1999
Query Match 70.4%; Score 38; DB 21; Length 199;

Best Local Similarity 66.7%; Pred. No. 86; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;
Search completed: November 13, 2003, 09:45:21
Job time : 31.2812 secs

GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         November 13, 2003, 09:45:35 ; Search time 18.6562 Seconds
                (without alignments)
                88.069 Million cell updates/sec

Title:          US-09-228-866-1
Perfect score:  54
Sequence:       1 CNSRLHLRC 9

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       666188 seqs, 182559486 residues

Total number of hits satisfying chosen parameters: 666188

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Published_Applications_AA:*



1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
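
The Score and Query Match figures reported for each result can be checked against the header values (Perfect score 54, scoring table BLOSUM62). The sketch below is only an illustration, not part of the GenCore output: the function names are hypothetical, the substitution table holds only the handful of BLOSUM62 entries needed for this query, and it scores ungapped, full-length pairings rather than running a Smith-Waterman search.

```python
# Illustrative sketch (not GenCore code): reproduce "Perfect score" and
# "Query Match" for ungapped, equal-length pairings of the query.
# Only the BLOSUM62 entries needed for these examples are included.
BLOSUM62_SUBSET = {
    ("C", "C"): 9, ("N", "N"): 6, ("S", "S"): 4, ("R", "R"): 5,
    ("L", "L"): 4, ("H", "H"): 8, ("H", "Q"): 0,
}


def pair_score(a, b):
    """Look up a substitution score; the table is symmetric."""
    key = (a, b) if (a, b) in BLOSUM62_SUBSET else (b, a)
    return BLOSUM62_SUBSET[key]  # KeyError if the pair is not in the subset


def ungapped_score(query, subject):
    """Sum substitution scores position by position (no gaps allowed)."""
    if len(query) != len(subject):
        raise ValueError("sequences must have equal length")
    return sum(pair_score(a, b) for a, b in zip(query, subject))


def query_match(score, perfect):
    """Query Match as a percentage of the perfect (self) score."""
    return round(100.0 * score / perfect, 1)


QUERY = "CNSRLHLRC"
perfect = ungapped_score(QUERY, QUERY)        # self-alignment of the query
variant = ungapped_score(QUERY, "CNSRLQLRC")  # single H -> Q substitution
print(perfect, variant, query_match(variant, perfect))
```

Under these assumptions the self-alignment gives 54 and the H-to-Q variant gives 46, i.e. a Query Match of 85.2%; the same ratio applied to a reported Score of 35 gives 64.8%.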
ALIGNMENTS



RESULT 1 

US-10-306-878-11 

; Sequence 11, Application US/10306878
; Publication No. US20030175819A1
; GENERAL INFORMATION:
; APPLICANT: Reed, John C.
; APPLICANT: Guo, Bin
; TITLE OF INVENTION: Methods for Identifying Modulators of
; TITLE OF INVENTION: Apoptosis
; FILE REFERENCE: P-LJ 5535
; CURRENT APPLICATION NUMBER: US/10/306,878
; CURRENT FILING DATE: 2002-11-27
; PRIOR APPLICATION NUMBER: US 60/334,149
; PRIOR FILING DATE: 2001-11-28
; NUMBER OF SEQ ID NOS: 28
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 11
; LENGTH: 9
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Synthetic construct
US-10-306-878-11

Query Match 100.0%; Score 54; DB 12; Length 9; 

Best Local Similarity 100.0%; Pred. No. 6e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 CNSRLHLRC 9
     |||||||||
Db 1 CNSRLHLRC 9



RESULT 2 

US-09-881-752A-242 

; Sequence 242, Application US/09881752A 



; Patent No. US20020115078A1
; GENERAL INFORMATION:
; APPLICANT: Kleanthous, Harold
; APPLICANT: Al-Garawi, Amal
; APPLICANT: Miller, Charles
; APPLICANT: Tomb, Jean-Francois
; APPLICANT: Oomen, Raymond P.
; TITLE OF INVENTION: Identification of Polynucleotides
; TITLE OF INVENTION: Encoding Novel Helicobacter Polypeptides in the Helicobacter
; TITLE OF INVENTION: Genome
; FILE REFERENCE: 06132/041002
; CURRENT APPLICATION NUMBER: US/09/881,752A
; CURRENT FILING DATE: 2001-06-15
; PRIOR APPLICATION NUMBER: US 08/833,457
; PRIOR FILING DATE: 1997-04-01
; NUMBER OF SEQ ID NOS: 370
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 242
; LENGTH: 306
; TYPE: PRT
; ORGANISM: Helicobacter pylori
US-09-881-752A-242



Query Match 64.8%; Score 35; DB 10; Length 306;
Best Local Similarity 66.7%; Pred. No. 2.6e+02;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy  1 CNSRLHLRC 9
      || | || |
Db 91 CNLRNHLAC 99



RESULT 3 

US-09-988-067B-78 

Sequence 78, Application US/09988067B 
Publication No. US20030124141A1 
GENERAL INFORMATION: 
APPLICANT: Haas, Rainer 
APPLICANT: Kleanthous, Harold
APPLICANT: Tomb, Jean-Francois
APPLICANT: Miller, Charles
APPLICANT: Al-Garawi, Amal
APPLICANT: Odenbreit, Stefan
APPLICANT: Meyer, Thomas

TITLE OF INVENTION: Helicobacter Polypeptides and
TITLE OF INVENTION: Corresponding Polynucleotide Molecules
FILE REFERENCE: 06132/040002
CURRENT APPLICATION NUMBER: US/09/988,067B
CURRENT FILING DATE: 2003-01-31
PRIOR APPLICATION NUMBER: US 08/831,309
PRIOR FILING DATE: 1997-04-01
NUMBER OF SEQ ID NOS: 112
SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 78
LENGTH: 306
TYPE: PRT



ORGANISM: Helicobacter pylori 
US-09-988-067B-78 



Query Match 64.8%; Score 35; DB 11; Length 306;

Best Local Similarity 66.7%; Pred. No. 2.6e+02;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy  1 CNSRLHLRC 9
      || | || |
Db 91 CNLRNHLAC 99



RESULT 4 

US-09-815-242-10452 

Sequence 10452, Application US/09815242 
Patent No. US20020061569A1 
GENERAL INFORMATION: 
APPLICANT: Haselbeck, Robert 
APPLICANT: Ohlsen, Kari L. 
APPLICANT: Zyskind, Judith W. 
APPLICANT: Wall, Daniel 
APPLICANT: Trawick, John D. 
APPLICANT: Carr, Grant J. 
APPLICANT: Yamamoto, Robert T.
APPLICANT: Xu, H. Howard 

TITLE OF INVENTION: Identification of Essential Genes in
TITLE OF INVENTION: Prokaryotes
FILE REFERENCE: ELITRA.011A
CURRENT APPLICATION NUMBER: US/09/815,242 
CURRENT FILING DATE: 2001-03-21 
PRIOR APPLICATION NUMBER: 60/191,078 
PRIOR FILING DATE: 2000-03-21 
PRIOR APPLICATION NUMBER: 60/206,848 
PRIOR FILING DATE: 2000-05-23 
PRIOR APPLICATION NUMBER: 60/207,727 
PRIOR FILING DATE: 2000-05-26 
PRIOR APPLICATION NUMBER: 60/242,578 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/253,625 
PRIOR FILING DATE: 2000-11-27 
PRIOR APPLICATION NUMBER: 60/257,931 
PRIOR FILING DATE: 2000-12-22 
PRIOR APPLICATION NUMBER: 60/269,308 
PRIOR FILING DATE: 2001-02-16 
NUMBER OF SEQ ID NOS: 14110
SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 10452 
LENGTH: 378 
TYPE: PRT 

ORGANISM: Escherichia coli 
US-09-815-242-10452 



Query Match 64.8%; Score 35; DB 9; Length 378;
Best Local Similarity 100.0%; Pred. No. 3.1e+02;
Matches 6; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    4 RLHLRC 9
        ||||||
Db  132 RLHLRC 137



RESULT 5 

US-09-815-242-5133 

Sequence 5133, Application US/09815242 
Patent No. US20020061569A1 
GENERAL INFORMATION: 
APPLICANT: Haselbeck, Robert 
APPLICANT: Ohlsen, Kari L. 
APPLICANT: Zyskind, Judith W. 
APPLICANT: Wall, Daniel 
APPLICANT: Trawick, John D. 
APPLICANT: Carr, Grant J. 
APPLICANT: Yamamoto, Robert T. 
APPLICANT: Xu, H. Howard 

TITLE OF INVENTION: Identification of Essential Genes in 
TITLE OF INVENTION: Prokaryotes 
FILE REFERENCE: ELITRA.011A
CURRENT APPLICATION NUMBER: US/09/815,242
CURRENT FILING DATE: 2001-03-21 
PRIOR APPLICATION NUMBER: 60/191,078 
PRIOR FILING DATE: 2000-03-21 
PRIOR APPLICATION NUMBER: 60/206,848 
PRIOR FILING DATE: 2000-05-23 
PRIOR APPLICATION NUMBER: 60/207,727 
PRIOR FILING DATE: 2000-05-26 
PRIOR APPLICATION NUMBER: 60/242,578 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/253,625 
PRIOR FILING DATE: 2000-11-27 
PRIOR APPLICATION NUMBER: 60/257,931 
PRIOR FILING DATE: 2000-12-22 
PRIOR APPLICATION NUMBER: 60/269,308 
PRIOR FILING DATE: 2001-02-16 
NUMBER OF SEQ ID NOS: 14110
SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 5133
LENGTH: 387
TYPE: PRT

ORGANISM: Pseudomonas aeruginosa 
US-09-815-242-5133 



Query Match 64.8%; Score 35; DB 9; Length 387;
Best Local Similarity 100.0%; Pred. No. 3.2e+02;
Matches 6; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    4 RLHLRC 9
        ||||||
Db  136 RLHLRC 141



RESULT 6 
US-09-954-433-8 

; Sequence 8, Application US/09954433 
; Patent No. US20020155562A1 



GENERAL INFORMATION: 

APPLICANT: Floh , Leopold 
Koenig, Kerstin 
Menge, Ulrich 

TITLE OF INVENTION: Glutathionylspermidine Synthetase and 

Processes for Recovery and Use Thereof 
NUMBER OF SEQUENCES: 9
CORRESPONDENCE ADDRESS:

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun

STREET: 233 South Wacker Drive/6300 Sears Tower

CITY: Chicago 

STATE: Illinois 

COUNTRY: United States of America 

ZIP : 60606 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/954,433

FILING DATE: 17-Sep-2001 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 09/330,740

FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION:

NAME: Zeller, James P.

REGISTRATION NUMBER: 28,491

REFERENCE/DOCKET NUMBER: 29473/35677 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 474-6300 

TELEFAX: (312) 474-0448 
INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 573 amino acids 

TYPE: amino acid 

STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: protein
HYPOTHETICAL: NO
ANTI-SENSE: NO
FEATURE:
NAME/KEY: Modified-site
LOCATION: 191
OTHER INFORMATION: /note= "Xaa = Lys or Asn"
FEATURE:
NAME/KEY: Modified-site
LOCATION: 463
OTHER INFORMATION: /note= "Xaa = Val or Asp"
FEATURE:
NAME/KEY: Modified-site
LOCATION: 479
OTHER INFORMATION: /note= "Xaa = Val or Gly"
SEQUENCE DESCRIPTION: SEQ ID NO: 8:
US-09-954-433-8



Query Match 64.8%; Score 35; DB 10; Length 573;
Best Local Similarity 55.6%; Pred. No. 4.6e+02;
Matches 5; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |:   ||||
Db  236 CDHEFHLRC 244



RESULT 7 

US-09-749-601A-10 

Sequence 10, Application US/09749601A 
Patent No. US20020128460A1 
GENERAL INFORMATION: 
APPLICANT: Nicolaides, Nicholas 
APPLICANT: Grasso, Luigi 
APPLICANT: Sass, Philip 
APPLICANT: Kinzler, Kenneth 
APPLICANT: Vogelstein, Bert 

TITLE OF INVENTION: A method for generating hypermutable 
TITLE OF INVENTION: plants 
FILE REFERENCE: 01107.00069 

CURRENT APPLICATION NUMBER: US/09/749,601A
CURRENT FILING DATE: 2000-12-28
PRIOR APPLICATION NUMBER: 60/183,333
PRIOR FILING DATE: 2000-02-18
NUMBER OF SEQ ID NOS: 14
SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 10
LENGTH: 1151
TYPE: PRT

ORGANISM: Arabidopsis thaliana 
US-09-749-601A-10 



Query Match 63.9%; Score 34.5; DB 10; Length 1151;
Best Local Similarity 60.0%; Pred. No. 1.1e+03;
Matches 6; Conservative 3; Mismatches 0; Indels 1; Gaps 1;

Qy    1 CN-SRLHLRC 9
        || |::||:|
Db  800 CNASQMHLKC 809



RESULT 8 

US-09-912-697-33 

; Sequence 33, Application US/09912697 

; Publication No. US20030068808A1 

; GENERAL INFORMATION: 

; APPLICANT: Nicolaides, Nicholas C 

; APPLICANT: Sass, Philip M

; APPLICANT: Grasso, Luigi M 

; APPLICANT: Kline, J Bradford 

; TITLE OF INVENTION: METHODS FOR GENERATING ANTIBIOTIC RESISTANT MICROBES AND NOVEL
; TITLE OF INVENTION: ANTIBIOTICS
; FILE REFERENCE: MOR-0040
; CURRENT APPLICATION NUMBER: US/09/912,697
; CURRENT FILING DATE: 2001-07-25
; NUMBER OF SEQ ID NOS: 39
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 33 

LENGTH: 1151 

TYPE: PRT 

ORGANISM: Arabidopsis thaliana 
US-09-912-697-33 

Query Match 63.9%; Score 34.5; DB 11; Length 1151;
Best Local Similarity 60.0%; Pred. No. 1.1e+03;
Matches 6; Conservative 3; Mismatches 0; Indels 1; Gaps 1;

Qy    1 CN-SRLHLRC 9
        || |::||:|
Db  800 CNASQMHLKC 809



RESULT 9 

US-09-983-802-580 

Sequence 580, Application US/09983802 
Publication No. US20030022185A1 
GENERAL INFORMATION: 
APPLICANT: Fischer et al.

TITLE OF INVENTION: 123 Human Secreted Proteins 
FILE REFERENCE: PZ010P1 

CURRENT APPLICATION NUMBER: US/09/983,802
CURRENT FILING DATE: 2001-10-25 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 09/227,357 
PRIOR FILING DATE: EARLIER FILING DATE: 1999-01-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: PCT/US98/13684
PRIOR FILING DATE: EARLIER FILING DATE: 1998-07-07 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,926 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/052,793 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,925 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,929 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/052,803 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/052,732 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,931 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,932 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,916 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,930 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,918 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,920 
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 
PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/052,733
PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/052,795
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,919

; PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/051,928 

PRIOR FILING DATE: EARLIER FILING DATE: 1997-07-08 
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,722 

PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,723
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,948
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,949
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,953
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,950
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,947
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,964
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/056,360
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18 
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,684 
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18 
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,984 
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18 
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/055,954
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-18 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/058,785 

PRIOR FILING DATE: EARLIER FILING DATE: 1997-09-12 
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/058,664 
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-09-12 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/058,660 
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-09-12 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/058,661 

PRIOR FILING DATE: EARLIER FILING DATE: 1997-09-12 
; NUMBER OF SEQ ID NOS: 672
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 580
LENGTH: 40
TYPE: PRT

ORGANISM: Homo sapiens 
US-09-983-802-580 



Query Match 63.0%; Score 34; DB 11; Length 40;
Best Local Similarity 44.4%; Pred. No. 59;
Matches 4; Conservative 4; Mismatches 1; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        | :|:|::|
Db    2 CVTRMHVKC 10



RESULT 10 



US-10-029-386-28271 

Sequence 28271, Application US/10029386 
Publication No. US20030194704A1
GENERAL INFORMATION: 
APPLICANT: Penn, Sharron G. 
APPLICANT: Rank, David R. 
APPLICANT: Hanzel, David K.

TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
FILE REFERENCE: AEOMICA-X-2
CURRENT APPLICATION NUMBER: US/10/029,386
CURRENT FILING DATE: 2001-12-20 
NUMBER OF SEQ ID NOS: 34288 

SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
SEQ ID NO 28271 
LENGTH: 65 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE:
OTHER INFORMATION: MAP TO CHR22_12.0
OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL = 1.7
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 1.8
OTHER INFORMATION: EXPRESSED IN LUNG, SIGNAL = 1.8
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 1.9
OTHER INFORMATION: EXPRESSED IN PLACENTA, SIGNAL = 2.1
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 1.8
US-10-029-386-28271 



Query Match 63.0%; Score 34; DB 12; Length 65;
Best Local Similarity 55.6%; Pred. No. 92;
Matches 5; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        | | :|| |
Db   51 CTSSMHLSC 59



RESULT 11 

US-09-771-161A-118 

; Sequence 118, Application US/09771161A 

; Patent No. US20020110811A1 

; GENERAL INFORMATION: 

; APPLICANT: LEVINE, et al.
; TITLE OF INVENTION: VARIANTS Of PROTEIN KINASES
; FILE REFERENCE: 802620-2005.1
; CURRENT APPLICATION NUMBER: US/09/771,161A

; CURRENT FILING DATE: 2001-01-26 

; PRIOR APPLICATION NUMBER: 09/724,676 

; PRIOR FILING DATE: 2000-11-28 

; PRIOR APPLICATION NUMBER: 136776 

; PRIOR FILING DATE: 2000-06-15 

; PRIOR APPLICATION NUMBER: 135619 

; PRIOR FILING DATE: 2000-04-12 

; NUMBER OF SEQ ID NOS: 273 

; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 118
LENGTH: 319
TYPE: PRT
ORGANISM: Homo sapiens
US-09-771-161A-118

Query Match 63.0%; Score 34; DB 10; Length 319;
Best Local Similarity 44.4%; Pred. No. 4e+02;
Matches 4; Conservative 3; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        ||: :| :|
Db  192 CNAAIHKKC 200



RESULT 12 

US-09-771-161A-209 

; Sequence 209, Application US/09771161A 

; Patent No. US20020110811A1 

; GENERAL INFORMATION: 

; APPLICANT: LEVINE, et al.

; TITLE OF INVENTION: VARIANTS Of PROTEIN KINASES 

; FILE REFERENCE: 802620-2005.1 

; CURRENT APPLICATION NUMBER: US/09/771,161A

; CURRENT FILING DATE: 2001-01-26 

; PRIOR APPLICATION NUMBER: 09/724,676 

; PRIOR FILING DATE: 2000-11-28 

; PRIOR APPLICATION NUMBER: 136776 

; PRIOR FILING DATE: 2000-06-15 

; PRIOR APPLICATION NUMBER: 135619 

; PRIOR FILING DATE: 2000-04-12 

; NUMBER OF SEQ ID NOS: 273
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 209
LENGTH: 676
TYPE: PRT
ORGANISM: Homo sapiens
US-09-771-161A-209

Query Match 63.0%; Score 34; DB 10; Length 676;
Best Local Similarity 44.4%; Pred. No. 7.9e+02;
Matches 4; Conservative 3; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        ||: :| :|
Db  192 CNAAIHKKC 200



RESULT 13 
US-09-749-956-2 

; Sequence 2, Application US/09749956
; Patent No. US20020068271A1 
; GENERAL INFORMATION: 

APPLICANT: La Jolla Institute For Allergy 

APPLICANT: Altman, Amnon
; APPLICANT: Coudronniere, Nolwenn
; TITLE OF INVENTION: METHODS FOR IDENTIFYING AGENTS CAPABLE OF MODULATING PROTEIN KINASE C
; TITLE OF INVENTION: THETA (PKC?) ACTIVITY
; FILE REFERENCE: 051501/0276390
; CURRENT APPLICATION NUMBER: US/09/749,956

; CURRENT FILING DATE: 2000-12-27 

; PRIOR APPLICATION NUMBER: 60/173,171 

; PRIOR FILING DATE: 1999-12-27 

; NUMBER OF SEQ ID NOS: 3
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 2
LENGTH: 706
TYPE: PRT
ORGANISM: Human
US-09-749-956-2

Query Match 63.0%; Score 34; DB 9; Length 706;
Best Local Similarity 44.4%; Pred. No. 8.2e+02;
Matches 4; Conservative 3; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        ||: :| :|
Db  193 CNAAIHKKC 201



RESULT 14 
US-10-100-818-4 

; Sequence 4, Application US/10100818 

; Publication No. US20030176333A1 

; GENERAL INFORMATION: 

; APPLICANT: Lorens, James B.

; APPLICANT: Xu, Weiduan 

; APPLICANT: Bogenberger, Jakob 

; APPLICANT: Rigel Pharmaceuticals, Incorporated 

; TITLE OF INVENTION: CASPR3: Modulators of Angiogenesis
; FILE REFERENCE: 021044-001900US
; CURRENT APPLICATION NUMBER: US/10/100,818
; CURRENT FILING DATE: 2002-03-18
; NUMBER OF SEQ ID NOS: 14
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 4
LENGTH: 1154
TYPE: PRT
; ORGANISM: Homo sapiens
FEATURE:

; OTHER INFORMATION: full length contactin associated protein 3 

OTHER INFORMATION: (CASPR3) 
US-10-100-818-4 



Query Match 63.0%; Score 34; DB 12; Length 1154;
Best Local Similarity 66.7%; Pred. No. 1.3e+03;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |  || |||
Db  542 CEQRLALRC 550



RESULT 15 



US-10-100-608B-9 

; Sequence 9, Application US/10100608B 

; Publication No. US20030104412A1 

; GENERAL INFORMATION: 

; APPLICANT: Heiskala, Marja 

; TITLE OF INVENTION: REG-LIKE PROTEIN 

; FILE REFERENCE: CDS-261 

; CURRENT APPLICATION NUMBER: US/10/100,608B

; CURRENT FILING DATE: 2002-09-10 

; PRIOR APPLICATION NUMBER: 60/276,414 

; PRIOR FILING DATE: 2002-03-16 

; NUMBER OF SEQ ID NOS: 45
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 9
LENGTH: 29

TYPE: PRT 

ORGANISM: Human 
US-10-100-608B-9 

Query Match 61.1%; Score 33; DB 15; Length 29;
Best Local Similarity 55.6%; Pred. No. 65;
Matches 5; Conservative 0; Mismatches 4; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        || | |  |
Db   18 CNKRQHFLC 26




Search completed: November 13, 2003, 09:58:27
Job time : 19.6562 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.



OM protein - protein search, using sw model 



Run on: November 13, 2003, 09:38:30 ; Search time 9.375 Seconds
(without alignments)
92.322 Million cell updates/sec

Title: US-09-228-866-1
Perfect score: 54
Sequence: 1 CNSRLHLRC 9

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 283308 seqs, 96168682 residues

Total number of hits satisfying chosen parameters: 283308

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : PIR_76:*
1: pir1:*
2: pir2:*
3: pir3:*
4: pir4:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
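
As a cross-check on the header values above, the following minimal Python sketch reproduces the "Perfect score" of 54 as the sum of BLOSUM62 self-match scores over the query, and the "Query Match" percentage as a result's score divided by that perfect score. This relationship is inferred from the figures printed in this report, not taken from GenCore documentation. The function names are hypothetical, and only the BLOSUM62 diagonal entries for the residues that occur in the query are included.

```python
# BLOSUM62 self-match (diagonal) scores, restricted to the residues
# that occur in the query CNSRLHLRC. Other residues are omitted.
BLOSUM62_DIAGONAL = {"C": 9, "N": 6, "S": 4, "R": 5, "L": 4, "H": 8}


def perfect_score(query):
    """Score of the query aligned against itself (hypothetical helper)."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match_percent(score, query):
    """Result score as a percentage of the perfect score, to one decimal."""
    return round(100.0 * score / perfect_score(query), 1)


QUERY = "CNSRLHLRC"

if __name__ == "__main__":
    print(perfect_score(QUERY))
    for score in (39, 38, 37, 36):
        print(score, query_match_percent(score, QUERY))
```

Under these assumptions, a score of 39 corresponds to a Query Match of 72.2%, which agrees with the first PIR result listed below.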
ALIGNMENTS



RESULT 1 
T13170 

diaphanous protein - fruit fly (Drosophila melanogaster) 
C;Species: Drosophila melanogaster
C;Date: 09-Jun-2000 #sequence_revision 09-Jun-2000 #text_change 17-Nov-2000
C;Accession: T13170
R;Castrillon, D.H.; Wasserman, S.A.; Castrillon, D.H.; Wasserman, S.A.
Development 120, 3367-3377, 1994
A;Title: Diaphanous is required for cytokinesis in Drosophila and shares domains
of similarity with the products of the limb deformity gene.
A;Reference number: Z17626; MUID:95121197; PMID:7821209
A;Accession: T13170
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1091 <CAS>
A;Cross-references: EMBL:U11288; NID:g575926; PID:g575927; PIDN:AAA67715.1
C;Genetics:
A;Gene: dia
A;Cross-references: FlyBase:FBgn0011202
A;Map position: 2L

Query Match 72.2%; Score 39; DB 2; Length 1091;
Best Local Similarity 87.5%; Pred. No. 24;
Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy    2 NSRLHLRC 9
        | ||||||
Db  307 NFRLHLRC 314



RESULT 2 
T50561 

SINAI protein [imported] - Vitis vinifera 
C;Species: Vitis vinifera
C;Date: 21-Jul-2000 #sequence_revision 21-Jul-2000 #text_change 24-May-2001
C;Accession: T50561
R;Brehm, I.; Korfei, M.; Preisig-Mueller, R.; Kindl, H.
submitted to the EMBL Data Library, November 1998
A;Description: A nuclear localized zinc finger protein found in a plant is
homologous to the Drosophila signal transducing factor seven in absentia.
A;Reference number: Z25132
A;Accession: T50561
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-315 <BRE>
A;Cross-references: EMBL:Y18471; PIDN:CAB40577.1
C;Superfamily: Drosophila developmental protein sina; RING finger homology



Query Match 70.4%; Score 38; DB 2; Length 315; 

Best Local Similarity 66.7%; Pred. No. 13; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 



Qy    1 CNSRLHLRC 9
        | ||:| ||
Db   78 CKSRVHNRC 86



RESULT 3 
A82581 

periplasmic proteinase XF2241 [imported] - Xylella fastidiosa (strain 9a5c) 
C;Species: Xylella fastidiosa
C;Date: 18-Aug-2000 #sequence_revision 20-Aug-2000 #text_change 02-Sep-2000
C;Accession: A82581
R;anonymous, The Xylella fastidiosa Consortium of the Organization for
Nucleotide Sequencing and Analysis, Sao Paulo, Brazil.
Nature 406, 151-157, 2000
A;Title: The genome sequence of the plant pathogen Xylella fastidiosa.
A;Reference number: A82515; MUID:20365717; PMID:10910347
A;Note: for a complete list of authors see reference number A59328 below
A;Accession: A82581
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-514 <SIM>
A;Cross-references: GB:AE004037; GB:AE003849; NID:g9107394; PIDN:AAF85040.1;
GSPDB:GN00128; XFSC:XF2241
A;Experimental source: strain 9a5c

R;Simpson, A.J.G.; Reinach, F.C.; Arruda, P.; Abreu, F.A.; Acencio, M.;
Alvarenga, R.; Alves, L.M.C.; Araya, J.E.; Baia, G.S.; Baptista, C.S.; Barros,
M.H.; Bonaccorsi, E.D.; Bordin, S.; Bove, J.M.; Briones, M.R.S.; Bueno, M.R.P.;
Camargo, A.A.; Camargo, L.E.A.; Carraro, D.M.; Carrer, H.; Colauto, N.B.;
Colombo, C.; Costa, F.F.; Costa, M.C.R.; Costa-Neto, C.M.; Coutinho, L.L.;
Cristofani, M.; Dias-Neto, E.; Docena, C.; El-Dorry, H.; Facincani, A.P.;
Ferreira, A.J.S.
submitted to GenBank, June 2000
A;Authors: Ferreira, V.C.A.; Ferro, J.A.; Fraga, J.S.; Franca, S.C.; Franco,
M.C.; Frohme, M.; Furlan, L.R.; Garnier, M.; Goldman, G.H.; Goldman, M.H.S.;
Gomes, S.L.; Gruber, A.; Ho, P.L.; Hoheisel, J.D.; Junqueira, M.L.; Kemper,
E.L.; Kitajima, J.P.; Krieger, J.E.; Kuramae, E.E.; Laigret, F.; Lambais, M.R.;
Leite, L.C.C.; Lemos, E.G.M.; Lemos, M.V.F.; Lopes, S.A.; Lopes, C.R.; Machado,
J.A.; Machado, M.A.; Madeira, A.M.B.N.; Madeira, H.M.F.; Marino, C.L.; Marques,
M.V.; Martins, E.A.L.
A;Authors: Martins, E.M.F.; Matsukuma, A.Y.; Menck, C.F.M.; Miracca, E.C.;
Miyaki, C.Y.; Monteiro-Vitorello, C.B.; Moon, D.H.; Nagai, M.A.; Nascimento,
A.L.T.O.; Netto, L.E.S.; Nhani Jr., A.; Nobrega, F.G.; Nunes, L.R.; Oliveira,
M.A.; de Oliveira, M.C.; de Oliveira, R.C.; Palmieri, D.A.; Paris, A.; Peixoto,
B.R.; Pereira, G.A.G.; Pereira Jr., H.A.; Pesquero, J.B.; Quaggio, R.B.;
Roberto, P.G.; Rodrigues, V.; Rosa, A.J. de M.; de Rosa Jr., V.E.; de Sa, R.G.;
Santelli, R.V.; Sawasaki, H.E.
A;Authors: da Silva, A.C.R.; da Silva, F.R.; da Silva, A.M.; Silva Jr., W.A.; da
Silveira, J.F.; Silvestri, M.L.Z.; Siqueira, W.J.; de Sousa, A.A.; de Souza,
A.P.; Terenzi, M.F.; Truffi, D.; Tsai, S.M.; Tsuhako, M.H.; Vallada, H.; Van
Sluys, M.A.; Verjovski-Almeida, S.; Vettore, A.L.; Zago, M.A.; Zatz, M.;
Meidanis, J.; Setubal, J.C.
A;Reference number: A59328
A;Contents: annotation
C;Genetics:
A;Gene: XF2241 

C;Superfamily: Helicobacter serine proteinase 

Query Match 70.4%; Score 38; DB 2; Length 514;
Best Local Similarity 75.0%; Pred. No. 19;
Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy    2 NSRLHLRC 9
        |||:| ||
Db    2 NSRIHTRC 9



RESULT 4 
TVFFDF 

protein kinase Draf-1 (EC 2.7.1.-) - fruit fly (Drosophila melanogaster) 
N;Alternate names: Draf-1 proto-oncogene protein-serine/threonine kinase; 
kinase-related transforming protein Draf-1; pole-hole protein 
C;Species: Drosophila melanogaster
C;Date: 31-Mar-1991 #sequence_revision 23-Feb-1996 #text_change 23-Feb-1997
C;Accession: S00393; S60191; A27808; S33602
R;Nishida, Y.; Hata, M.; Ayaki, T.; Ryo, H.; Yamagata, M.; Shimizu, K.;
Nishizuka, Y.
EMBO J. 7, 775-781, 1988
A;Title: Proliferation of both somatic and germ cells is affected in the
Drosophila mutants of raf proto-oncogene.
A;Reference number: S00393; MUID:88283647; PMID:3135183
A;Accession: S00393
A;Molecule type: DNA
A;Residues: 1-781 <NIS>
A;Cross-references: EMBL:X07181
A;Note: the assignment of the start codon has been revised in reference S33602
A;Accession: S60191
A;Molecule type: mRNA
A;Residues: 148-781 <NIS2>
R;Mark, G.E.; MacIntyre, R.J.; Digan, M.E.; Ambrosio, L.; Perrimon, N.
Mol. Cell. Biol. 7, 2134-2140, 1987
A;Title: Drosophila melanogaster homologs of the raf oncogene.
A;Reference number: A27808; MUID:87257926; PMID:3037346
A;Accession: A27808
A;Molecule type: mRNA
A;Residues: 'LQ',465-519,'R',521,'A',523-570,'R',572-699,'PQAL',704-713,'PT',716-753 <MAR>
R;Sprenger, F.; Trosclair, M.M.; Morrison, D.K.
Mol. Cell. Biol. 13, 1163-1172, 1993
A;Title: Biochemical analysis of torso and D-raf during Drosophila
embryogenesis: implications for terminal signal transduction.
A;Reference number: S33602; MUID:93140754; PMID:8423783
A;Contents: annotation
A;Note: this is a revision of the assignment of the start codon in reference
S00393
A;Note: the authors call the N-terminal extended version of the protein Draf-3
A;Note: the cited sequence in S33602 shows Pro for residue 342
C;Genetics:
A;Gene: Draf-1
A;Cross-references: FlyBase:FBgn0003079
A;Map position: X 2F
A;Introns: 417/3; 464/3; 589/2
C;Superfamily: protein kinase A-raf; protein kinase C zinc-binding repeat
homology; protein kinase homology
C;Keywords: ATP; phosphotransferase; proto-oncogene; serine/threonine-specific
protein kinase; transforming protein
F;265-310/Domain: protein kinase C zinc-binding repeat homology <K22>
F;469-735/Domain: protein kinase homology <KIN>
F;477-485/Region: protein kinase ATP-binding motif
F;497/Active site: Lys #status predicted

Query Match 70.4%; Score 38; DB 1; Length 781;
Best Local Similarity 66.7%; Pred. No. 28;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        || | | ||
Db  294 CNFRFHQRC 302



RESULT 5 
T08924 

hypothetical protein T15N24.30 - Arabidopsis thaliana 
C;Species: Arabidopsis thaliana (mouse-ear cress)
C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 02-Sep-2000
C;Accession: T08924
R;Bevan, M.; Zimmermann, W.; Grueneisen, A.; Wambutt, R.; Bancroft, I.; Mewes,
H.W.; Mayer, K.F.X.; Lemcke, K.; Schueller, C.
submitted to the Protein Sequence Database, May 1999
A;Reference number: Z16518
A;Accession: T08924
A;Molecule type: DNA
A;Residues: 1-464 <BEV>
A;Cross-references: EMBL:AL078465; GSPDB:GN00062; ATSP:T15N24.30
A;Experimental source: cultivar Columbia; BAC clone T15N24
C;Genetics:
A;Gene: ATSP:T15N24.30
A;Map position: 4
A;Introns: 38/2; 84/3; 106/3; 297/2; 416/3
C;Superfamily: RING finger homology
F;414-464/Domain: RING finger homology <RRN>

Query Match 68.5%; Score 37; DB 2; Length 464;
Best Local Similarity 55.6%; Pred. No. 27;
Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |: | ||:|
Db  436 CSHRFHLKC 444



RESULT 6 
S47645 

tMDC I protein - crab-eating macaque 

C;Species: Macaca fascicularis (crab-eating macaque)
C;Date: 27-Jan-1995 #sequence_revision 27-Jan-1995 #text_change 21-Jul-2000
C;Accession: S47645
R;Barker, H.L.; Perry, A.C.F.; Jones, R.; Hall, L.
Biochim. Biophys. Acta 1218, 429-431, 1994
A;Title: Sequence and expression of a monkey testicular transcript encoding tMDC
I, a novel member of the metalloproteinase-like, disintegrin-like, cysteine-rich
(MDC) protein family.
A;Reference number: S47645; MUID:94325353; PMID:8049267
A;Accession: S47645
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-736 <BAR>
A;Cross-references: EMBL:X76637; NID:g535016; PIDN:CAA54085.1; PID:g535017
C;Superfamily: mouse meltrin alpha; disintegrin homology
F;392-477/Domain: disintegrin homology <DIS>

Query Match 68.5%; Score 37; DB 2; Length 736; 

Best Local Similarity 55.6%; Pred. No. 41; 

Matches 5; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy    1 CNSRLHLRC 9
        || | | :|
Db  629 CNDRFHCQC 637



RESULT 7 
T27924 

hypothetical protein ZK593.4 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 21-Jul-2000 
C;Accession: T27924 
R;McMurray, A. 

submitted to the EMBL Data Library, February 1996
A;Reference number: Z20440
A;Accession: T27924
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1430 <WIL>
A;Cross-references: EMBL:Z69385; PIDN:CAA93426.1; GSPDB:GN00022; CESP:ZK593.4
A;Experimental source: clone ZK593
C;Genetics:
A;Gene: CESP:ZK593.4
A;Map position: 4
A;Introns: 48/3; 92/3; 238/2; 254/1; 924/2; 987/1; 1085/3; 1304/2; 1404/1
C;Superfamily: human retinoblastoma binding protein 2

Query Match 68.5%; Score 37; DB 2; Length 1430;
Best Local Similarity 55.6%; Pred. No. 73;
Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |:|  |:||
Db 1179 CDSEFHVRC 1187



RESULT 8 
T02395 

hypothetical protein At2g44400 [imported] - Arabidopsis thaliana 
N;Alternate names: hypothetical protein F4I1.21 



C;Species: Arabidopsis thaliana (mouse-ear cress)
C;Date: 05-Mar-1999 #sequence_revision 05-Mar-1999 #text_change 02-Feb-2001
C;Accession: T02395; A84878
R;Rounsley, S.D.; Lin, X.; Ketchum, K.A.; Crosby, M.L.; Brandon, R.C.; Sykes,
S.M.; Kaul, S.; Mason, T.M.; Kerlavage, A.R.; Adams, M.D.; Somerville, C.R.;
Venter, J.C.
submitted to the EMBL Data Library, May 1998
A;Description: Arabidopsis thaliana chromosome II BAC F4I1 genomic sequence.
A;Reference number: Z14667
A;Accession: T02395
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-146 <ROU>
A;Cross-references: EMBL:AC004521; NID:g3128166; PID:g3128182
A;Experimental source: cultivar Columbia
R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.;
Fujii, C.Y.; Mason, T.M.; Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell,
C.R.; Ketchum, K.A.; Lee, J.J.; Ronning, C.M.; Koo, H.; Moffat, K.S.; Cronin,
L.A.; Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.E.; Adams,
M.D.; Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver,
G.P.; Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser,
C.M.; Venter, J.C.
Nature 402, 761-768, 1999
A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis
thaliana.
A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: A84878
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-146 <STO>
A;Cross-references: GB:AE002093; NID:g3128182; PIDN:AAC16086.1; GSPDB:GN00139
C;Genetics:
A;Gene: At2g44400; F4I1.21
A;Map position: 2



Query Match 66.7%; Score 36; DB 2; Length 146;
Best Local Similarity 55.6%; Pred. No. 15;
Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        ||  :||:|
Db   18 CNFYIHLKC 26



RESULT 9 
S55134 

probable membrane protein YMR187C - yeast (Saccharomyces cerevisiae) 
N;Alternate names: hypothetical protein YM8010.17C 
C;Species: Saccharomyces cerevisiae
C;Date: 08-Jul-1995 #sequence_revision 01-Sep-1995 #text_change 19-Apr-2002
C;Accession: S55134
R;Churcher, C.M.
submitted to the EMBL Data Library, June 1995
A;Reference number: S55118
A;Accession: S55134
A;Molecule type: DNA
A;Residues: 1-431 <CHU>
A;Cross-references: EMBL:Z49808; NID:g854440; PID:g854457; GSPDB:GN00013;
MIPS:YMR187c
A;Experimental source: strain AB972
C;Genetics:
A;Gene: MIPS:YMR187C
A;Cross-references: SGD:S0004799
A;Map position: 13R
C;Superfamily: Saccharomyces cerevisiae probable membrane protein YMR187c
C;Keywords: transmembrane protein
F;228-244/Domain: transmembrane #status predicted <TMM>

Query Match 66.7%; Score 36; DB 2; Length 431;
Best Local Similarity 55.6%; Pred. No. 39;
Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        || ::| ||
Db   37 CNLQIHKRC 45



RESULT 10 
E86165 

F15K9.2 protein - Arabidopsis thaliana 

C;Species: Arabidopsis thaliana (mouse-ear cress)

C;Date: 02-Mar-2001 #sequence_revision 02-Mar-2001 #text_change 09-Nov-2001
C;Accession: E86165

R;Theologis, A.; Ecker, J.R.; Palm, C.J.; Federspiel, N.A.; Kaul, S.; White, O.;
Alonso, J.; Altaf, H.; Araujo, R.; Bowman, C.L.; Brooks, S.Y.; Buehler, E.;
Chan, A.; Chao, Q.; Chen, H.; Cheuk, R.F.; Chin, C.W.; Chung, M.K.; Conn, L.;
Conway, A.B.; Conway, A.R.; Creasy, T.H.; Dewar, K.; Dunn, P.; Etgu, P.;
Feldblyum, T.V.; Feng, J.; Fong, B.; Fujii, C.Y.; Gill, J.E.; Goldsmith, A.D.;
Haas, B.; Hansen, N.F.; Hughes, B.; Huizar, L.
Nature 408, 816-820, 2000

A;Authors: Hunter, J.L.; Jenkins, J.; Johnson-Hopson, C.; Khan, S.; Khaykin, E.;
Kim, C.J.; Koo, H.L.; Kremenetskaia, I.; Kurtz, D.B.; Kwan, A.; Lam, B.;
Langin-Hooper, S.; Lee, A.; Lee, J.M.; Lenz, C.A.; Li, J.H.; Li, Y.; Lin, X.; Liu,
S.X.; Liu, Z.A.; Luros, J.S.; Maiti, R.; Marziali, A.; Militscher, J.; Miranda,
M.; Nguyen, M.; Nierman, W.C.; Osborne, B.I.; Pai, G.; Peterson, J.; Pham, P.K.;
Rizzo, M.; Rooney, T.; Rowley, D.; Sakano, H.

A;Authors: Salzberg, S.L.; Schwartz, J.R.; Shinn, P.; Southwick, A.M.; Sun, H.;
Tallon, L.J.; Tambunga, G.; Toriumi, M.J.; Town, C.D.; Utterback, T.; van Aken,
S.; Vaysberg, M.; Vysotskaia, V.S.; Walker, M.; Wu, D.; Yu, G.; Fraser, C.M.;
Venter, J.C.; Davis, R.W.

A;Title: Sequence and analysis of chromosome 1 of the plant Arabidopsis.

A;Reference number: A86141; MUID:21016719; PMID:11130712

A;Accession: E86165

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-1020 <STO>

A;Cross-references: GB:AE005172; NID:g3850588; PIDN:AAC72128.1; GSPDB:GN00141

C;Genetics:

A;Map position: 1

Query Match 66.7%; Score 36; DB 2; Length 1020;
Best Local Similarity 75.0%; Pred. No. 83;
Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy    1 CNSRLHLR 8
        | |:||||
Db  645 CQSKLHLR 652



RESULT 11 
H97836 

DNA ligase (NAD) (EC 6.5.1.2) - Rickettsia conorii (strain Malish 7) 
C;Species: Rickettsia conorii

C;Date: 30-Sep-2001 #sequence_revision 30-Sep-2001 #text_change 03-Jun-2002
C;Accession: H97836

R;Ogata, H.; Audic, S.; Renesto-Audiffren, P.; Fournier, P.E.; Barbe, V.;
Samson, D.; Roux, V.; Cossart, P.; Weissenbach, J.; Claverie, J.M.; Raoult, D.
Science 293, 2093-2098, 2001

A;Title: Mechanisms of Evolution in Rickettsia conorii and Rickettsia
prowazekii.

A;Reference number: A97700; MUID:21442074; PMID:11557893
A;Accession: H97836
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-689 <KUR>

A;Cross-references: GB:AE006914; PIDN:AAL03634.1; PID:g15620219; GSPDB:GN00173
C;Genetics:
A;Gene: lig

C;Superfamily: polydeoxyribonucleotide synthase (NAD+)
C;Keywords: ligase

Query Match 65.7%; Score 35.5; DB 2; Length 689;
Best Local Similarity 43.8%; Pred. No. 73;
Matches 7; Conservative 2; Mismatches 0; Indels 7; Gaps 1;

Qy    1 CNSRLH-------LRC 9
        |||:||       :||
Db  416 CNSKLHYTPEDIIIRC 431



RESULT 12 
JH0460 

corticostatic peptide GP-CS3 - guinea pig 
C;Species: Cavia porcellus (guinea pig)

C;Date: 30-Jun-1992 #sequence_revision 30-Jun-1992 #text_change 18-Aug-2000
C;Accession: JH0460

R;Hu, J.; Bennett, H.P.J.; Lazure, C.; Solomon, S.
Biochem. Biophys. Res. Commun. 180, 558-565, 1991

A;Title: Isolation and characterization of corticostatic peptides from guinea
pig bone marrow.

A;Reference number: JH0458; MUID:92062075; PMID:1659400

A;Accession: JH0460

A;Molecule type: protein

A;Residues: 1-13 <HUJ>

A;Experimental source: bone marrow

A;Note: this is a dimer having an antiparallel configuration

C;Comment: This peptide belongs to a family of Cys-rich, cationic peptides of
low molecular weight.

C;Comment: This peptide has antimicrobial activity by a non-oxygen-dependent
mechanism.

C;Superfamily: unassigned animal peptides

F;5/Disulfide bonds: interchain (to 13) #status experimental

F;7/Disulfide bonds: interchain (to 11) #status experimental

F;11/Disulfide bonds: interchain (to 7) #status experimental

F;13/Disulfide bonds: interchain (to 5) #status experimental

Query Match 64.8%; Score 35; DB 2; Length 13;
Best Local Similarity 66.7%; Pred. No. 2.8;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |  ||| ||
Db    5 CFCRLHCRC 13



RESULT 13 
E84774 

probable RING zinc finger protein [imported] - Arabidopsis thaliana 
C;Species: Arabidopsis thaliana (mouse-ear cress)

C;Date: 02-Feb-2001 #sequence_revision 02-Feb-2001 #text_change 02-Feb-2001
C;Accession: E84774

R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.;
Fujii, C.Y.; Mason, T.M.; Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell,
C.R.; Ketchum, K.A.; Lee, J.J.; Ronning, C.M.; Koo, H.; Moffat, K.S.; Cronin,
L.A.; Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.E.; Adams,
M.D.; Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver,
C.P.; Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser,
C.M.; Venter, J.C.
Nature 402, 761-768, 1999

A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis
thaliana.

A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: E84774
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-180 <STO>

A;Cross-references: GB:AE002093; NID:g4510378; PIDN:AAD21466.1; GSPDB:GN00139
C;Genetics:
A;Gene: At2g35910
A;Map position: 2

Query Match 64.8%; Score 35; DB 2; Length 180;
Best Local Similarity 55.6%; Pred. No. 28;
Matches 5; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        ||   ||:|
Db  129 CNHLFHLKC 137



RESULT 14 
S62511 

probable peptide methionine sulfoxide reductase - fission yeast 
(Schizosaccharomyces pombe) 
C;Species: Schizosaccharomyces pombe

C;Date: 12-Feb-1998 #sequence_revision 20-Feb-1998 #text_change 10-Dec-1999
C;Accession: T38506; S62511

R;Jones, L.; Murphy, L.; McNeil, A.; Simpson, I.; Harris, D.; Barrell, B.G.;
Rajandream, M.A.; Walsh, S.V.

submitted to the EMBL Data Library, October 1995
A;Reference number: Z21798
A;Accession: T38506

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-187 <J02>

A;Cross-references: EMBL:Z66525; NID:g1044926; PIDN:CAA91427.1; PID:g1044931;
GSPDB:GN00066; SPDB:SPAC29E6.05c

A;Experimental source: strain 972h-; cosmid c29E6
C;Genetics:

A;Gene: SPDB:SPAC29E6.05c
A;Map position: 1

C;Superfamily: peptide methionine sulfoxide reductase

Query Match 64.8%; Score 35; DB 2; Length 187;
Best Local Similarity 44.4%; Pred. No. 29;
Matches 4; Conservative 5; Mismatches 0; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |:||::::|
Db  159 CSSRMNIKC 167



RESULT 15 
S37312 

transcription activator hlyT NhaR VC0677 [similarity] - Vibrio cholerae (strain
N16961 serogroup O1)
C;Species: Vibrio cholerae

C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 02-Feb-2001
C;Accession: S37312; G82292

R;Williams, S.G.; Attridge, S.R.; Manning, P.A.
Mol. Microbiol. 9, 751-760, 1993

A;Title: The transcriptional activator HlyU of Vibrio cholerae: nucleotide
sequence and role in virulence gene expression.
A;Reference number: S37312; MUID:94049116; PMID:8231807
A;Accession: S37312

A;Status: nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-296 <WIL>

A;Cross-references: EMBL:X66866; NID:g403330; PID:g403331

A;Note: the nucleotide sequence was submitted to the EMBL Data Library, June
1992

R;Heidelberg, J.F.; Eisen, J.A.; Nelson, W.C.; Clayton, R.A.; Gwinn, M.L.;
Dodson, R.J.; Haft, D.H.; Hickey, E.K.; Peterson, J.D.; Umayam, L.A.; Gill,
S.R.; Nelson, K.E.; Read, T.D.; Tettelin, H.; Richardson, D.; Ermolaeva, M.D.;
Vamathevan, J.; Bass, S.; Qin, H.; Dragoi, I.; Sellers, P.; McDonald, L.;
Utterback, T.; Fleishmann, R.D.; Nierman, W.C.; White, O.; Salzberg, S.L.;
Smith, H.O.; Colwell, R.R.; Mekalanos, J.J.; Venter, J.C.; Fraser, C.M.
Nature 406, 477-483, 2000

A;Title: DNA Sequence of both chromosomes of the cholera pathogen Vibrio
cholerae.

A;Reference number: A82035; MUID:20406833; PMID:10952301
A;Accession: G82292
A;Status: preliminary
A;Molecule type: DNA

A;Residues: 1-296 <HEI>

A;Cross-references: GB:AE004154; GB:AE003852; NID:g9655115; PIDN:AAF93842.1;
GSPDB:GN00126; TIGR:VC0677

A;Experimental source: serogroup O1; strain N16961; biotype El Tor

C;Genetics:

A;Gene: hlyT; VC0677

A;Map position: 1

C;Superfamily: regulatory protein nhaR
C;Keywords: DNA binding; transcription regulation
F;21-40/Region: helix-turn-helix motif

Query Match 64.8%; Score 35; DB 1; Length 296;
Best Local Similarity 62.5%; Pred. No. 43;
Matches 5; Conservative 3; Mismatches 0; Indels 0; Gaps 0;

Qy    2 NSRLHLRC 9
        ::|:||||
Db  119 DNRIHLRC 126



Search completed: November 13, 2003, 09:52:49
Job time : 11.375 secs

GenCore version 5.1.6

Copyright (c) 1993 - 2003 Compugen Ltd.



OM protein - protein search, using sw model

Run on: November 13, 2003, 09:31:40 ; Search time 5.15625 Seconds
        (without alignments)
        82.083 Million cell updates/sec

Title: US-09-228-866-1
Perfect score: 54
Sequence: 1 CNSRLHLRC 9

Scoring table: BLOSUM62
               Gapop 10.0 , Gapext 0.5

Searched: 127863 seqs, 47026705 residues

Total number of hits satisfying chosen parameters: 127863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database : SwissProt 41:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
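
The percentages printed with each hit can be cross-checked against the header values. The sketch below assumes, as an inference from the numbers in this report rather than from GenCore documentation, that Query Match is the hit score divided by the perfect score (54 here) and that Best Local Similarity is the number of exact matches divided by the number of aligned columns. The function names are hypothetical.

```python
# Sketch: recompute the percentages reported with each result.
# Assumption (inferred from this report, not from GenCore documentation):
#   Query Match           = Score / Perfect score
#   Best Local Similarity = Matches / aligned columns

PERFECT_SCORE = 54  # "Perfect score" from the search header


def query_match(score, perfect_score=PERFECT_SCORE):
    """Percent of the perfect score reached by a hit, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, conservative, mismatches, indels=0):
    """Percent of aligned columns that are exact matches, to one decimal."""
    aligned_columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


if __name__ == "__main__":
    # A hit with Score 36 and 5 matches, 2 conservative, 2 mismatches.
    print(query_match(36))                 # 66.7
    print(best_local_similarity(5, 2, 2))  # 55.6
```

Under these assumptions the recomputed values agree with the printed ones for hits such as Score 36 (66.7%) and Score 42 (77.8%).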
ALIGNMENTS



RESULT 1 
AS14_HUMAN

ID AS14_HUMAN STANDARD; PRT; 302 AA.
AC Q8WXK2;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Ankyrin repeat and SOCS box containing protein 14 (ASB-14).
GN ASB14.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RA Kile B.T., Nicola N.A.;
RT "SOCS box proteins.";
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
CC -!- SIMILARITY: Contains 4 ANK repeats.
CC -!- SIMILARITY: Contains 1 SOCS box domain.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AF403032; AAL57351.1; -.
DR Genew; HGNC:19766; ASB14.
DR InterPro; IPR002110; ANK.
DR InterPro; IPR001496; SOCS.
DR Pfam; PF00023; ank; 3.
DR SMART; SM00248; ANK; 4.
DR PROSITE; PS50088; ANK_REPEAT; 2.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DR PROSITE; PS50225; SOCS; 1.
KW ANK repeat; Repeat.



FT REPEAT 28 57 ANK 1.
FT REPEAT 70 99 ANK 2.
FT REPEAT 100 129 ANK 3.
FT REPEAT 131 164 ANK 4.
FT DOMAIN 236 291 SOCS BOX.
SQ SEQUENCE 302 AA; 34562 MW; 0B8C6E7219E9EF7B CRC64;

Query Match 77.8%; Score 42; DB 1; Length 302;
Best Local Similarity 77.8%; Pred. No. 0.76;
Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        |  ||||||
Db  260 CMGRLHLRC 268





RESULT 2 
DIA_DROME

ID DIA_DROME STANDARD; PRT; 1091 AA.
AC P48608; Q9VIJ7;
DT 01-FEB-1996 (Rel. 33, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Diaphanous protein.
GN DIA OR CG1768.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=95121197; PubMed=7821209;
RA Castrillon D.H., Wasserman S.A.;
RT "Diaphanous is required for cytokinesis in Drosophila and shares
RT domains of similarity with the products of the limb deformity gene.";
RL Development 120:3367-3377(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=Berkeley;
RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
RN [3]
RP FUNCTION.
RX MEDLINE=20214846; PubMed=10751177;
RA Afshar K., Stuart B., Wasserman S.A.;
RT "Functional analysis of the Drosophila diaphanous FH protein in early
RT embryonic development.";
RL Development 127:1887-1897(2000).

CC -!- FUNCTION: REQUIRED FOR CYTOKINESIS IN BOTH MITOSIS AND MEIOSIS.
CC HAS A ROLE IN ACTIN CYTOSKELETON ORGANIZATION AND IS ESSENTIAL FOR
CC MANY, IF NOT ALL, ACTIN-MEDIATED EVENTS INVOLVING MEMBRANE
CC INVAGINATION. MAY SERVE AS A MEDIATOR BETWEEN SIGNALING MOLECULES
CC AND ACTIN ORGANIZERS AT SPECIFIC PHASES OF THE CELL CYCLE.
CC POSSIBLE COMPONENT OF THE CONTRACTILE RING OR MAY CONTROL ITS
CC FUNCTION.
CC -!- SUBCELLULAR LOCATION: LOCALIZES TO THE SITE WHERE THE METAPHASE
CC FURROW IS ANTICIPATED TO FORM, TO THE GROWING TIP OF
CC CELLULARIZATION FURROWS, AND TO CONTRACTILE RINGS.
CC -!- DOMAIN: DRFS ARE REGULATED BY INTRAMOLECULAR GBD-DAD BINDING WHERE
CC RHO-GTP ACTIVATES THE DRFS BY DISRUPTING THE GBD-DAD INTERACTION
CC (BY SIMILARITY).
CC -!- SIMILARITY: Contains 1 GTPase-binding (GBD) domain.
CC -!- SIMILARITY: Contains 1 Formin homology 1 (FH1) domain.
CC -!- SIMILARITY: Contains 1 Formin homology 2 (FH2) domain.
CC -!- SIMILARITY: Contains 1 Formin homology 3 (FH3) domain.
CC -!- SIMILARITY: Contains 1 DRF autoregulatory (DAD) domain.
CC -!- SIMILARITY: BELONGS TO THE FORMIN HOMOLOGY FAMILY. DIAPHANOUS
CC SUBFAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U11288; AAA67715.1; -.
DR EMBL; AE003668; AAF53922.1; -.
DR PIR; T13170; T13170.
DR FlyBase; FBgn0011202; dia.
DR InterPro; IPR003104; FH2.
DR Pfam; PF02181; FH2; 1.
DR SMART; SM00498; FH2; 1.
KW Cell division; Coiled coil.



FT DOMAIN 47 242 GBD.
FT DOMAIN 143 448 FH3.
FT DOMAIN 446 500 COILED COIL (POTENTIAL).
FT DOMAIN 512 596 FH1 (PRO-RICH).
FT DOMAIN 601 1044 FH2.
FT DOMAIN 967 1021 COILED COIL (POTENTIAL).
FT DOMAIN 1027 1041 DAD.
FT DOMAIN 1050 1053 ARG-RICH (BASIC).
FT DOMAIN 512 518 POLY-PRO.
FT DOMAIN 519 522 POLY-GLY.
FT DOMAIN 524 532 POLY-PRO.
FT DOMAIN 539 548 POLY-PRO.
FT DOMAIN 554 561 POLY-PRO.
FT DOMAIN 566 572 POLY-PRO.
FT DOMAIN 581 585 POLY-PRO.
FT CONFLICT 733 733 H -> Q (IN REF. 1).
SQ SEQUENCE 1091 AA; 123170 MW; A4379D7A089B5EE7 CRC64;

Query Match 72.2%; Score 39; DB 1; Length 1091;
Best Local Similarity 87.5%; Pred. No. 11;
Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy    2 NSRLHLRC 9
        | ||||||
Db  307 NFRLHLRC 314



RESULT 3 
KRAF_DROME

ID KRAF_DROME STANDARD; PRT; 781 AA.
AC P11346;
DT 01-JUL-1989 (Rel. 11, Created)
DT 01-APR-1993 (Rel. 25, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE RAF homolog serine/threonine-protein kinase dRAF-1 (EC 2.7.1.-)
DE (Pole-hole protein).
GN PHL OR DRAF-1 OR D-RAF.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=88283647; PubMed=3135183;
RA Nishida Y., Hata M., Ayaki T., Ryo H., Yamagata M., Shimizu K.,
RA Nishizuka Y.;
RT "Proliferation of both somatic and germ cells is affected in the
RT Drosophila mutants of raf proto-oncogene.";
RL EMBO J. 7:775-781(1988).
RN [2]
RP SEQUENCE OF 465-753 FROM N.A.
RX MEDLINE=87257926; PubMed=3037346;
RA Mark G.E., Macintyre R.J., Digan M.E., Ambrosio L., Perrimon N.;
RT "Drosophila melanogaster homologs of the raf oncogene.";
RL Mol. Cell. Biol. 7:2134-2140(1987).
RN [3]
RP CHARACTERIZATION.
RX MEDLINE=93140754; PubMed=8423783;
RA Sprenger F., Torsoclair M.M., Morrison D.K.;
RT "Biochemical analysis of torso and D-raf during Drosophila
RT embryogenesis: implications for terminal signal transduction.";
RL Mol. Cell. Biol. 13:1163-1172(1993).
CC -!- FUNCTION: SERINE/THREONINE KINASE REQUIRED IN THE EARLY EMBRYO
CC FOR THE FORMATION OF TERMINAL STRUCTURE. ALSO REQUIRED DURING
CC THE PROLIFERATION OF IMAGINAL CELLS. MAY ACT DOWNSTREAM OF RAS1
CC IN THE SEV SIGNAL TRANSDUCTION PATHWAY.
CC -!- PTM: EXTENSIVELY PHOSPHORYLATED AT 1 TO 2 H AFTER EGG LAYING.



CC -!- SIMILARITY: BELONGS TO THE SER/THR FAMILY OF PROTEIN KINASES.
CC MIL/RAF SUBFAMILY.
CC -!- SIMILARITY: Contains 1 zinc-dependent phorbol-ester and DAG
CC binding domain.
CC -!- SIMILARITY: Contains 1 Ras-binding (RBD) domain.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X07181; CAA30166.1; ALT_INIT.
DR EMBL; M16598; -; NOT_ANNOTATED_CDS.
DR HSSP; P04049; 1RFA.
DR FlyBase; FBgn0003079; phl.
DR GO; GO:0008069; P:dorsal/ventral axis determination, follicul...; IMP.
DR GO; GO:0007369; P:gastrulation; NAS.
DR GO; GO:0007283; P:spermatogenesis; IMP.
DR GO; GO:0007362; P:terminal region determination; IMP.
DR InterPro; IPR002219; DAG_PE-bind.
DR InterPro; IPR000719; Prot_kinase.
DR InterPro; IPR003116; RBD.
DR InterPro; IPR002290; Ser_thr_pkinase.
DR Pfam; PF00130; DAG_PE-bind; 1.
DR Pfam; PF00069; pkinase; 1.
DR Pfam; PF02196; RBD; 1.
DR ProDom; PD000001; Prot_kinase; 1.
DR SMART; SM00109; C1; 1.
DR SMART; SM00455; RBD; 1.
DR PROSITE; PS00479; DAG_PE_BIND_DOM_1; 1.
DR PROSITE; PS50081; DAG_PE_BIND_DOM_2; 1.
DR PROSITE; PS00107; PROTEIN_KINASE_ATP; 1.
DR PROSITE; PS50011; PROTEIN_KINASE_DOM; 1.
DR PROSITE; PS00108; PROTEIN_KINASE_ST; 1.
DR PROSITE; PS50898; RBD; 1.
KW Transferase; Serine/threonine-protein kinase; ATP-binding; Zinc;
KW Phorbol-ester binding; Phosphorylation.



FT DOMAIN 183 254 RAS-BINDING.
FT DOMAIN 265 310 PHORBOL-ESTER AND DAG BINDING
FT DOMAIN 471 732 PROTEIN KINASE.
FT NP_BIND 477 485 ATP (BY SIMILARITY).
FT BINDING 497 497 ATP (BY SIMILARITY).
FT ACT_SITE 590 590 BY SIMILARITY.
FT CONFLICT 495 495 P -> A (IN REF. 2).
FT CONFLICT 520 522 KKT -> RKA (IN REF. 2).
FT CONFLICT 571 571 G -> R (IN REF. 2).
FT CONFLICT 700 703 RRHS -> PQAL (IN REF. 2).
SQ SEQUENCE 781 AA; 88794 MW; DEAD54762249EADC CRC64;

Query Match 70.4%; Score 38; DB 1; Length 781;
Best Local Similarity 66.7%; Pred. No. 12;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        || | | ||
Db  294 CNFRFHQRC 302


RESULT 4 
YM49_YEAST

ID YM49_YEAST STANDARD; PRT; 431 AA.
AC Q03236;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Hypothetical 50.3 kDa protein in HSC82-GCV2 intergenic region.
GN YMR187C OR YM8010.17C.
OS Saccharomyces cerevisiae (Baker's yeast).
OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
OC Saccharomycetales; Saccharomycetaceae; Saccharomyces.
OX NCBI_TaxID=4932;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=S288c / AB972;
RX PubMed=9169872;
RA Bowman S., Churcher C.M., Badcock K., Brown D., Chillingworth T.,
RA Connor R., Dedman K., Devlin K., Gentles S., Hamlin N., Hunt S.,
RA Jagels K., Lye G., Moule S., Odell C., Pearson D., Rajandream M.A.,
RA Rice P., Skelton J., Walsh S., Whitehead S., Barrell B.G.;
RT "The nucleotide sequence of Saccharomyces cerevisiae chromosome
RT XIII.";
RL Nature 387:90-93(1997).
CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; Z49808; CAA89920.1; -.
DR PIR; S55134; S55134.
DR SGD; S0004799; YMR187C.
KW Hypothetical protein; Transmembrane.
FT TRANSMEM 228 248 POTENTIAL.
FT TRANSMEM 279 299 POTENTIAL.
FT TRANSMEM 349 369 POTENTIAL.
FT TRANSMEM 388 408 POTENTIAL.
SQ SEQUENCE 431 AA; 50287 MW; 61165A68455B92F1 CRC64;

Query Match 66.7%; Score 36; DB 1; Length 431;
Best Local Similarity 55.6%; Pred. No. 15;
Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CNSRLHLRC 9
        || ::| ||
Db   37 CNLQIHKRC 45



RESULT 5 
Z335_HUMAN 

ID Z335_HUMAN STANDARD; PRT; 1342 AA. 

AC Q9H4Z2; Q9H684; 

DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Zinc finger protein 335.
GN ZNF335.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=21638749; PubMed=11780052;
RA Deloukas P., Matthews L.H., Ashurst J., Burton J., Gilbert J.G.R.,
RA Jones M., Stavrides G., Almeida J.P., Babbage A.K., Bagguley C.L.,
RA Bailey J., Barlow K.F., Bates K.N., Beard L.M., Beare D.M.,
RA Beasley O.P., Bird C.P., Blakey S.E., Bridgeman A.M., Brown A.J.,
RA Buck D., Burrill W.D., Butler A.P., Carder C., Carter N.P.,
RA Chapman J.C., Clamp M., Clark G., Clark L.N., Clark S.Y., Clee C.M.,
RA Clegg S., Cobley V.E., Collier R.E., Connor R.E., Corby N.R.,
RA Coulson A., Coville G.J., Deadman R., Dhami P.D., Dunn M.,
RA Ellington A.G., Frankland J.A., Fraser A., French L., Garner P.,
RA Grafham D.V., Griffiths C., Griffiths M.N.D., Gwilliam R., Hall R.E.,
RA Hammond S., Harley J.L., Heath P.D., Ho S., Holden J.L., Howden P.J.,
RA Huckle E., Hunt A.R., Hunt S.E., Jekosch K., Johnson C.M., Johnson D.,
RA Kay M.P., Kimberley A.M., King A., Knights A., Laird G.K., Lawlor S.,
RA Lehvaeslaiho M.H., Leversha M.A., Lloyd C., Lloyd D.M., Lovell J.D.,
RA Marsh V.L., Martin S.L., McConnachie L.J., McLay K., McMurray A.A.,
RA Milne S.A., Mistry D., Moore M.J.F., Mullikin J.C., Nickerson T.,
RA Oliver K., Parker A., Patel R., Pearce T.A.V., Peck A.I.,
RA Phillimore B.J.C.T., Prathalingam S.R., Plumb R.W., Ramsay H.,
RA Rice C.M., Ross M.T., Scott C.E., Sehra H.K., Shownkeen R., Sims S.,
RA Skuce C.D., Smith M.L., Soderlund C., Steward C.A., Sulston J.E.,
RA Swann R.M., Sycamore N., Taylor R., Tee L., Thomas D.W., Thorpe A.,
RA Tracey A., Tromans A.C., Vaudin M., Wall M., Wallis J.M.,
RA Whitehead S.L., Whittaker P., Willey D.L., Williams L., Williams S.A.,
RA Wilming L., Wray P.W., Hubbard T., Durbin R.M., Bentley D.R., Beck S.,
RA Rogers J.;
RT "The DNA sequence and comparative analysis of human chromosome 20.";
RL Nature 414:865-871(2001).
RN [2]
RP SEQUENCE OF 455-1342 FROM N.A.
RA Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S.,
RA Okitani R., Ota T., Suzuki Y., Obayashi M., Nishi T., Shibahara T.,
RA Tanaka T., Nakamura Y., Isogai T., Sugano S.;
RT "NEDO human cDNA sequencing project.";
RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: MAY FUNCTION AS A TRANSCRIPTION FACTOR.
CC -!- SUBCELLULAR LOCATION: Nuclear (Probable).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way



CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AL162458; CAC10457.1; -.
DR EMBL; AK026157; BAB15379.1; ALT_INIT.
DR Genew; HGNC:15807; ZNF335.
DR InterPro; IPR007087; Znf_C2H2.
DR Pfam; PF00096; zf-C2H2; 13.
DR SMART; SM00355; ZnF_C2H2; 13.
DR PROSITE; PS00028; ZINC_FINGER_C2H2_1; 6.
DR PROSITE; PS50157; ZINC_FINGER_C2H2_2; 13.
KW Transcription regulation; Zinc-finger; Metal-binding; Nuclear protein;
KW DNA-binding; Repeat.
FT ZN_FING 245 268 C2H2-TYPE.
FT ZN_FING 465 487 C2H2-TYPE.
FT ZN_FING 495 517 C2H2-TYPE.
FT ZN_FING 523 545 C2H2-TYPE.
FT ZN_FING 562 584 C2H2-TYPE.
FT ZN_FING 590 612 C2H2-TYPE.
FT ZN_FING 621 643 C2H2-TYPE.
FT ZN_FING 649 672 C2H2-TYPE.
FT ZN_FING 678 701 C2H2-TYPE.
FT ZN_FING 1019 1041 C2H2-TYPE.
FT ZN_FING 1047 1069 C2H2-TYPE.
FT ZN_FING 1075 1097 C2H2-TYPE.
FT ZN_FING 1103 1126 C2H2-TYPE.
FT DOMAIN 1178 1330 GLN-RICH.
SQ SEQUENCE 1342 AA; 144892 MW; 6D230DEE0B3AE670 CRC64;

Query Match 66.7%; Score 36; DB 1; Length 1342;
Best Local Similarity 75.0%; Pred. No. 50;
Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy    2 NSRLHLRC 9
        | |||:||
Db  692 NLRLHVRC 699



RESULT 6 
R1AB_CVHSA 

ID R1AB_CVHSA STANDARD; PRT; 7073 AA. 

AC P59641; 

DT 15-SEP-2003 (Rel. 42, Created)
DT 15-SEP-2003 (Rel. 42, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Replicase polyprotein 1ab (pp1ab) (ORF1AB) [Includes: Replicase
DE polyprotein 1a (pp1a) (ORF1A)] [Contains: Leader protein; p65 homolog;
DE Papain-like proteinase (EC 3.4.24.-) (NSP1); 3C-like proteinase
DE (EC 3.4.24.-) (3CL-PRO) (NSP2); HD2 (NSP3); NSP4; NSP5; NSP6; Growth
DE factor-like (NSP7); RNA-directed RNA polymerase (EC 2.7.7.48) (RdRp)
DE (NSP9); Helicase (Hel) (NSP10); NSP11; NSP12; NSP13].
OS Human coronavirus (strain SARS) (HCoV-SARS).
OC Viruses; ssRNA positive-strand viruses, no DNA stage; Nidovirales;
OC Coronaviridae; Coronavirus.
OX NCBI_TaxID=227859;
RN [1] 



RP SEQUENCE FROM N.A. 

RC STRAIN=Isolate Urbani; 

RA Bellini W.J., Campagnoli R.P., Icenogle J. P., Monroe S.S., Nix W.A. , 

RA Oberste M.S., Pallansch M.A., Rota P.A. ; 

RL Submitted (APR-2003) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Isolate Tor2; 

RA Marra M., Jones S.J.M., Holt R. ; 

RT "The complete genome of the SARS associated coronavirus."; 

RL Submitted (APR-2003) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Isolate CUHK-W1; 

RA Tsui S.K.W., Lo D.Y.M., Tam J.S., Fung K.P., Chim S.S.C., Au C.C.,

RA Chan A.H., Wan A.W.K., Au K.W., Chan C.W., Kou C.Y.C., Lam H.M.,

RA Lam W.Y., Lau S.K., Lau Y.L., Lau Y.M., Law S.L., Law T.W., Li M.L.Y.,

RA Tse C.H., Wong C.H., Yiu W.H., Lee C.Y., Chan A.K.C., Chiu R.W.K.,

RA Ng E.K.O., Tong Y.K., Chan P.K.S., Au-Yeung C., Cheung J.K.L., Chu I.,

RA Hung E.C.W., Waye M.M.Y.;

RT " DNA sequence of a human coronavirus (CUHK-W1) from a patient with 

RT severe acute respiratory syndrome (SARS) in Hong Kong."; 

RL Submitted (APR-2003) to the EMBL/GenBank/DDBJ databases. 
RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Isolate HKU-39849;

RA Leung F.C., Zeng F., Chan C.W.M., Chan C.M.Y., Chen J., Chow K.Y.C.,

RA Hon C.C.C., Hui R.K.H., Li J., Li V.Y.Y., Wang Y.Y., Peiris J.S.M.,

RA Poon L.L.M.;

RL Submitted (APR-2003) to the EMBL/GenBank/DDBJ databases. 
RN [5] 

RP SEQUENCE OF 4993-5127 FROM N.A. 

RC STRAIN=Isolate Vietnam; 

RA Emery S., Erdman D., Peret T., Ksiazek T.;

RL Submitted (APR-2003) to the EMBL/GenBank/DDBJ databases. 

RN [6] 

RP SEQUENCE OF 4993-5136 FROM N.A. 

RC STRAIN=Isolate Taiwan; 

RA Lin J.-H., Chiu S.-C., Yang J.-Y., Wang S.-F., Chen H.-Y.;

RT "Detection of a novel human coronavirus in a severe acute respiratory 

RT syndrome patient in Taiwan."; 

RL Submitted (APR-2003) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: The replicase polyprotein of coronaviruses is a 

CC multifunctional protein: it contains the activities necessary for 

CC the transcription of negative stranded RNA, leader RNA, subgenomic 

CC mRNAs and progeny virion RNA as well as proteinases responsible 

CC for the cleavage of the polyprotein into functional products (By 

CC similarity).

CC -!- CATALYTIC ACTIVITY: N nucleoside triphosphate = N diphosphate +
CC {RNA}(N).

CC -!- PTM: Specific enzymatic cleavages in vivo yield mature proteins
CC (By similarity).

CC -!- MISCELLANEOUS: This protein is translated as a 1A-1B polyprotein

CC by a ribosomal frameshifting mechanism (By similarity).

CC -!- SIMILARITY: Contains 1 peptidase family C16 domain. 

CC -!- SIMILARITY: Contains 1 peptidase family C30 domain. 

CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AY278741; AAP13442.1;
DR EMBL; AY278741; AAP13439.1; -.
DR EMBL; AY278741; AAP13440.1; ALT_SEQ.
DR EMBL; AY274119; -; NOT_ANNOTATED_CDS.
DR EMBL; AY278554; AAP13566.1;
DR EMBL; AY278554; AAP13575.1;
DR EMBL; AY278491; -; NOT_ANNOTATED_CDS.
DR EMBL; AY269391; AAP04003.1;
DR EMBL; AY268049; AAP04587.1;
DR InterPro; IPR002589; A1pp.
DR InterPro; IPR007095; RNAjpolJDSJPS.
DR InterPro; IPR007094; RNA_pol_PSvir.
DR InterPro; IPR002877; FtsJ.
DR Pfam; PF01661; A1pp; 1.
DR Pfam; PF01728; FtsJ; 1.
DR SMART; SM00506; A1pp; 1.
KW Polyprotein; Transferase; RNA-directed RNA polymerase; Thiol protease;
KW Hydrolase; Helicase; ATP-binding.



FT DOMAIN 1 179 LEADER PROTEIN (POTENTIAL).
FT DOMAIN 180 818 P65 HOMOLOG (POTENTIAL).
FT DOMAIN ? PAPAIN-LIKE PROTEINASE (POTENTIAL).
FT DOMAIN 3240 3547 3C-LIKE PROTEINASE (POTENTIAL).
FT DOMAIN 3548 3836 HD2/NSP3 (POTENTIAL).
FT DOMAIN 3837 3919 NSP4 (POTENTIAL).
FT DOMAIN 3920 4117 NSP5 (POTENTIAL).
FT DOMAIN 4118 4229 NSP6 (POTENTIAL).
FT DOMAIN 4230 4369 GROWTH FACTOR-LIKE (POTENTIAL).
FT DOMAIN 4370 5301 RNA-DIRECTED RNA POLYMERASE (POTENTIAL).
FT DOMAIN 5302 5902 HELICASE (POTENTIAL).
FT DOMAIN 5903 6429 NSP11 (POTENTIAL).
FT DOMAIN 6430 6775 NSP12 (POTENTIAL).
FT DOMAIN 6776 7073 NSP13 (POTENTIAL).
FT ACT_SITE 1909 1909 POTENTIAL.
FT NP_BIND 5583 5590 ATP (POTENTIAL).
FT DOMAIN 930 933 POLY-GLU.
FT DOMAIN 937 942 POLY-GLU.
FT DOMAIN 974 979 POLY-GLU.
FT DOMAIN 2210 2213 POLY-LEU.
FT DOMAIN 3766 3769 POLY-CYS.
FT VARIANT 2552 2552 V -> A (in isolates Tor2, CUHK-W1 and
FT HKU-39849).
FT VARIANT 2556 2556 D -> N (in isolate HKU-39849).
FT VARIANT 2708 2708 S -> T (in isolate HKU-39849).
FT VARIANT 2718 2718 R -> T (in isolate HKU-39849).
FT VARIANT 3047 3047 V -> A (in isolate CUHK-W1).
FT VARIANT 3072 3072 V -> A (in isolate CUHK-W1).
FT VARIANT 4379 4382 RVCG -> GFAV (in ORF1A).
FT VARIANT 5131 5131 A -> G (in isolate Taiwan).
FT VARIANT 5134 5135 CY -> VL (in isolate Taiwan).
FT VARIANT 5767 5767 D -> E (in isolate CUHK-W1).
FT VARIANT 6778 6778 Q -> R (in isolate Tor2).
FT VARIANT 6883 6883 D -> Y (in isolate Tor2).
SQ SEQUENCE 7073 AA; 790270 MW; A91B3CE920E69D4C CRC64;



Query Match 66.7%; Score 36; DB 1; Length 7073;

Best Local Similarity 66.7%; Pred. No. 3e+02;

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              |||:  |||
Db       5309 CNSQTSLRC 5317



RESULT 7 
NHAR_VIBCH 

ID NHAR_VIBCH STANDARD; PRT; 296 AA. 

AC P52692; Q9JMP8; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Transcriptional activator protein nhaR (Na+/H+ antiporter regulatory 

DE protein) . 

GN NHAR OR HLYT OR VC0677. 

OS Vibrio cholerae. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Vibrionales;

OC Vibrionaceae; Vibrio.

OX NCBI_TaxID=666;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=Classical Inaba Z17561 / Serotype O1;

RX MEDLINE=94049116; PubMed=8231807;

RA Williams S.G., Attridge S.R., Manning P.A.; 

RT "The transcriptional activator HlyU of Vibrio cholerae: nucleotide 

RT sequence and role in virulence gene expression."; 

RL Mol. Microbiol. 9:751-760(1993). 

RN [2] 

RP SEQUENCE FROM N.A.

RC STRAIN=El Tor O17 / Serotype O1;

RX MEDLINE=98117066; PubMed=9457888;

RA Williams S.G., Carmel-Harel O., Manning P.A.;

RT "A functional homolog of Escherichia coli NhaR in Vibrio cholerae.";

RL J. Bacteriol. 180:762-765(1998).

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=El Tor N16961 / Serotype O1;

RX MEDLINE=20406833; PubMed=10952301;

RA Heidelberg J.F., Eisen J.A., Nelson W.C., Clayton R.A., Gwinn M.L.,

RA Dodson R.J., Haft D.H., Hickey E.K., Peterson J.D., Umayam L.A.,

RA Gill S.R., Nelson K.E., Read T.D., Tettelin H., Richardson D.,

RA Ermolaeva M.D., Vamathevan J., Bass S., Qin H., Dragoi I., Sellers P.,

RA McDonald L., Utterback T., Fleischmann R.D., Nierman W.C., White O.,

RA Salzberg S.L., Smith H.O., Colwell R.R., Mekalanos J.J., Venter J.C.,

RA Fraser C.M.;

RT "DNA sequence of both chromosomes of the cholera pathogen Vibrio 

RT cholerae."; 

RL Nature 406:477-483(2000). 



CC -!- FUNCTION: PLAYS A ROLE IN THE POSITIVE REGULATION OF NHAA 
CC (PROBABLE).

CC -!- SUBCELLULAR LOCATION: Cytoplasmic.

CC -!- SIMILARITY: BELONGS TO THE LYSR FAMILY OF TRANSCRIPTIONAL
CC REGULATORS.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; X66S66; CAA47335.1;

DR EMBL; AJ002395; CAA05371.1; -. 

DR EMBL; AE004154; AAF93842.1; 

DR PIR; S37312; S37312. 

DR TIGR; VC0677; 

DR InterPro; IPR000847; HTH_LysR.

DR Pfam; PF00126; HTH_1; 1.

DR PROSITE; PS00044; HTH_LYSR_FAMILY; 1.

KW Transcription regulation; DNA-binding; Activator; Complete proteome.

FT DNA_BIND 21 40 H-T-H MOTIF (POTENTIAL).

SQ SEQUENCE 296 AA; 33559 MW; C7830B4B532DBC0C CRC64; 

Query Match 64.8%; Score 35; DB 1; Length 296; 

Best Local Similarity 62.5%; Pred. No. 15; 

Matches 5; Conservative 3; Mismatches 0; Indels 0; Gaps 0; 
Qy 2 NSRLHLRC 9 

Db 119 DNRIHLRC 126 
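
The summary figures printed for an ungapped hit such as the one above can be re-derived by hand. The following is a minimal sketch, not GenCore's implementation: it assumes an ungapped alignment, hard-codes only the few BLOSUM62 entries this example needs (any other residue pair raises a `KeyError`), and assumes that "Conservative" counts non-identical pairs with a positive substitution score. The names `BLOSUM62_SUBSET` and `alignment_stats` are hypothetical.

```python
# Partial BLOSUM62 table: only the residue pairs needed for this example.
# Keys are alphabetically sorted pairs because the matrix is symmetric.
BLOSUM62_SUBSET = {
    ("C", "C"): 9, ("H", "H"): 8, ("L", "L"): 4,
    ("N", "N"): 6, ("R", "R"): 5, ("S", "S"): 4,
    ("D", "N"): 1, ("N", "S"): 1, ("I", "L"): 2,
}


def alignment_stats(query, db):
    """Score an ungapped alignment and count the three pair categories."""
    if len(query) != len(db):
        raise ValueError("ungapped alignment needs equal-length segments")
    stats = {"score": 0, "matches": 0, "conservative": 0, "mismatches": 0}
    for a, b in zip(query, db):
        s = BLOSUM62_SUBSET[tuple(sorted((a, b)))]
        stats["score"] += s
        if a == b:
            stats["matches"] += 1
        elif s > 0:  # assumed meaning of "Conservative"
            stats["conservative"] += 1
        else:
            stats["mismatches"] += 1
    return stats


# Perfect score: the query aligned against itself.
perfect = alignment_stats("CNSRLHLRC", "CNSRLHLRC")["score"]
# The hit above: query residues 2-9 against database residues 119-126.
hit = alignment_stats("NSRLHLRC", "DNRIHLRC")
query_match = round(100 * hit["score"] / perfect, 1)
similarity = round(100 * hit["matches"] / len("NSRLHLRC"), 1)
print(perfect, hit, query_match, similarity)
```

Under these assumptions the sketch reproduces the figures reported for this hit: perfect score 54, score 35, 5 matches, 3 conservative, 0 mismatches, Query Match 64.8% and Best Local Similarity 62.5%.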



RESULT 8
PHNM_ECOLI

ID PHNM_ECOLI STANDARD; PRT; 378 AA.

AC P16689;

DT 01-AUG-1990 (Rel. 15, Created)

DT 01-NOV-1991 (Rel. 20, Last sequence update)

DT 16-OCT-2001 (Rel. 40, Last annotation update)

DE PhnM protein.

GN PHNM OR B4095.

OS Escherichia coli.

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;

OC Enterobacteriaceae; Escherichia.

OX NCBI_TaxID=562;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=K12;

RX MEDLINE=91193228; PubMed=1840580;

RA Makino K., Kim S.K., Shinagawa H., Amemura M., Nakata A.;

RT "Molecular analysis of the cryptic and functional phn operons for

RT phosphonate use in Escherichia coli K-12.";

RL J. Bacteriol. 173:2665-2672(1991).

RN [2]





RP SEQUENCE FROM N.A. 

RC STRAIN=K12 / MG1655; 

RX MEDLINE=95334362; PubMed=7610040;

RA Burland V.D., Plunkett G. III, Sofia H.J., Daniels D.L.,

RA Blattner F.R.;

RT "Analysis of the Escherichia coli genome VI: DNA sequence of the 

RT region from 92.8 through 100 minutes."; 

RL Nucleic Acids Res. 23:2105-2119(1995). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=B;

RX MEDLINE=90170953; PubMed=2155230;

RA Chen C.-M., Ye Q.-Z., Zhu Z., Wanner B.L., Walsh C.T.; 

RT "Molecular biology of carbon-phosphorus bond cleavage. Cloning and 

RT sequencing of the phn (psiD) genes involved in alkylphosphonate 

RT uptake and C-P lyase activity in Escherichia coli B.";

RL J. Biol. Chem. 265:4461-4471(1990). 

CC -!- FUNCTION: BELONGS TO AN OPERON INVOLVED IN ALKYLPHOSPHONATE 

CC UPTAKE AND C-P LYASE. EXACT FUNCTION NOT KNOWN. 

CC -!- MISCELLANEOUS: THE SEQUENCE SHOWN IS THAT OF STRAIN K12.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D90227; BAA14273.1; 

DR EMBL; U14003; AAA96994.1; -. 

DR EMBL; AE000482; AAC77056.1; 

DR EMBL; J05260; AAA24352.1; 

DR PIR; S56323; S56323.

DR EcoGene; EG10722; phnM.

DR InterPro; IPR006680; Amidohydro_1.

DR Pfam; PF01979; Amidohydro_1; 1.

DR ProDom; PD000518; Urease; 1. 

KW Alkylphosphonate uptake; Complete proteome. 

FT VARIANT 318 318 Q -> E (IN STRAIN B).

SQ SEQUENCE 378 AA; 42010 MW; 28CC9C5C77EAD37D CRC64;

Query Match 64.8%; Score 35; DB 1; Length 378;

Best Local Similarity 100.0%; Pred. No. 20;

Matches 6; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          4 RLHLRC 9
              ||||||
Db        132 RLHLRC 137



RESULT 9 
TRYS_CRIFA 

ID TRYS_CRIFA STANDARD; PRT; 652 AA. 

AC O60993;

DT 28-FEB-2003 (Rel. 41, Created)

DT 28-FEB-2003 (Rel. 41, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update)

DE Trypanothione synthetase (EC 6.3.1.9) (Cf-TS).

GN TRS.

OS Crithidia fasciculata.

OC Eukaryota; Euglenozoa; Kinetoplastida; Trypanosomatidae; Crithidia.

OX NCBI_TaxID=5656;

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 168-191; 232-255; 437-450; 

RP 457-474; 592-610 AND 617-630. 

RC STRAIN=HS6;

RX MEDLINE=98344022; PubMed=9677355;

RA Tetaud E., Manai F., Barrett M.P., Nadeau K., Walsh C.T.,

RA Fairlamb A.H.;

RT "Cloning and characterization of the two enzymes responsible for 

RT trypanothione biosynthesis in Crithidia fasciculata."; 

RL J. Biol. Chem. 273:19383-19390(1998). 

RN [2] 

RP SEQUENCE OF 168-191; 232-255; 437-450; 457-474; 592-610 AND 617-630, 

RP AND CHARACTERIZATION. 

RX MEDLINE=93278303; PubMed=1304372;

RA Smith K., Nadeau K., Bradley M., Walsh C., Fairlamb A.H.;

RT "Purification of glutathionylspermidine and trypanothione synthetases 

RT from Crithidia fasciculata."; 

RL Protein Sci. 1:874-883(1992). 

RN [3] 

RP SEQUENCE OF 17-36; 224-229; 235-242; 515-533; 550-561 AND 617-629, AND 

RP CHARACTERIZATION.

RX MEDLINE=97277330; PubMed=9115252;

RA Koenig K., Menge U., Kiess M., Wray V., Flohe L.;

RT "Convenient isolation and kinetic mechanism of glutathionylspermidine 

RT synthetase from Crithidia fasciculata."; 

RL J. Biol. Chem. 272:11908-11915(1997). 

CC -!- FUNCTION: Conjugates glutathione (gamma-Glu-Cys-Gly) and

CC glutathionylspermidine to form trypanothione (N(1),N(8)-

CC bis(glutathionyl)spermidine), which is involved in maintaining

CC intracellular thiol redox and in defense against oxidants.

CC -!- CATALYTIC ACTIVITY: Gamma-L-glutamyl-L-cysteinyl-glycine + N(1)-

CC (gamma-L-glutamyl-L-cysteinyl-glycyl)-spermidine + ATP =

CC N(1),N(8)-bis-(gamma-L-glutamyl-L-cysteinyl-glycyl)-spermidine +

CC ADP + phosphate.

CC -!- COFACTOR: Magnesium. 

CC -!- PTM: The N-Terminal is blocked. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF006615; AAC39132.1; -.
DR InterPro; IPR005494; GSP_synth.
DR Pfam; PF05257; AXE; 1.
DR Pfam; PF03738; GSP_synth; 1.
KW Ligase; Magnesium.

SQ SEQUENCE 652 AA; 74516 MW; 321BE90D39EEEA80 CRC64;



Query Match 64.8%; Score 35; DB 1; Length 652; 

Best Local Similarity 55.6%; Pred. No. 36; 

Matches 5; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 CNSRLHLRC 9
              |:   ||||
Db        268 CDHEFHLRC 276



RESULT 10 
UBR1_KLULA 

ID UBR1_KLULA STANDARD; PRT; 1941 AA. 

AC O60014;

DT 15-DEC-1998 (Rel. 37, Created)

DT 15-DEC-1998 (Rel. 37, Last sequence update)

DT 15-DEC-1998 (Rel. 37, Last annotation update)

DE N-end-recognizing protein (Ubiquitin-protein ligase E3 component) (N-

DE recognin).

GN UBR1.

OS Kluyveromyces lactis (Yeast).

OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;

OC Saccharomycetales; Saccharomycetaceae; Kluyveromyces.

OX NCBI_TaxID=28985;

RN [1]

RP SEQUENCE FROM N.A.

RA Waller P.R.H., Varshavsky A.;

RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: RECOGNITION COMPONENT OF THE N-END RULE PATHWAY. BINDS 

CC TO PROTEINS BEARING AMINO-TERMINAL RESIDUES THAT ARE DESTABILIZING

CC ACCORDING TO THE N-END RULE, BUT DOES NOT BIND TO OTHERWISE

CC IDENTICAL PROTEINS BEARING STABILIZING AMINO-TERMINAL RESIDUES.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF061554; AAC15841.1; -. 

DR PIR; T30554; T30554. 

DR InterPro; IPR003126; ZnfJSfrecognin . 

DR InterPro; IPR001841; Znf_ring. 

DR Pfam; PF02207; zf-UBR1; 1.

DR SMART; SM00184; RING; 1.

DR SMART; SM00396; ZnF_UBR1; 1.

KW Ligase; Ubl conjugation pathway.

SQ SEQUENCE 1941 AA; 223682 MW; 37C2E1BCA0803268 CRC64;

Query Match 64.8%; Score 35; DB 1; Length 1941;
Best Local Similarity 55.6%; Pred. No. 1.1e+02;

Matches 5; Conservative 1; Mismatches 3; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              ||  :| ||
Db       1271 CNHAVHYRC 1279



RESULT 11 
RL37_METTH

ID RL37_METTH STANDARD; PRT; 60 AA.

AC O26744;

DT 15-DEC-1998 (Rel. 37, Created)

DT 15-DEC-1998 (Rel. 37, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update)

DE 50S ribosomal protein L37e.

GN RPL37E OR MTH648.

OS Methanobacterium thermoautotrophicum.

OC Archaea; Euryarchaeota; Methanobacteria; Methanobacteriales;

OC Methanobacteriaceae; Methanothermobacter.

OX NCBI_TaxID=187420;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Delta H; 

RX MEDLINE=98037514; PubMed=9371463;

RA Smith D.R., Doucette-Stamm L.A., Deloughery C., Lee H.-M., Dubois J.,

RA Aldredge T., Bashirzadeh R., Blakely D., Cook R., Gilbert K.,

RA Harrison D., Hoang L., Keagle P., Lumm W., Pothier B., Qiu D.,

RA Spadafora R., Vicare R., Wang Y., Wierzbowski J., Gibson R.,

RA Jiwani N., Caruso A., Bush D., Safer H., Patwell D., Prabhakar S.,

RA McDougall S., Shimer G., Goyal A., Pietrovski S., Church G.M.,

RA Daniels C.J., Mao J.-I., Rice P., Noelling J., Reeve J.N.;

RT "Complete genome sequence of Methanobacterium thermoautotrophicum 

RT deltaH: functional analysis and comparative genomics."; 

RL J. Bacteriol. 179:7135-7155(1997). 

CC -!- SIMILARITY: BELONGS TO THE L37E FAMILY OF RIBOSOMAL PROTEINS. 

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AE000845; AAB85153.1; -. 

DR PIR; B69186; B69186. 

DR HAMAP; MF_00547; -; 1. 

DR InterPro; IPR001569; Ribosomal_L37E.

DR Pfam; PF01907; Ribosomal_L37e; 1.

DR ProDom; PD005132; Ribosomal_L37E; 1.

DR PROSITE; PS01077; RIBOSOMAL_L37E; 1.

KW Ribosomal protein; Complete proteome.

SQ SEQUENCE 60 AA; 7123 MW; 3B7026A579EAC9D5 CRC64;

Query Match 63.0%; Score 34; DB 1; Length 60; 

Best Local Similarity 62.5%; Pred. No. 4.2; 

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 
Qy 2 NSRLHLRC 9 

Db 11 NKNLHIRC 18 



RESULT 12 
ECR1_AERPE 

ID ECR1_AERPE STANDARD; PRT; 235 AA.

AC Q9YC02;

DT 15-SEP-2003 (Rel. 42, Created)

DT 15-SEP-2003 (Rel. 42, Last sequence update) 

DT 15-SEP-2003 (Rel. 42, Last annotation update) 

DE Probable exosome complex RNA-binding protein 1. 

GN APE1448. 

OS Aeropyrum pernix.

OC Archaea; Crenarchaeota; Thermoprotei; Desulfurococcales;

OC Desulfurococcaceae; Aeropyrum. 

OX NCBI_TaxID=56636; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=K1;

RX MEDLINE=99310339; PubMed=10382966;

RA Kawarabayasi Y., Hino Y., Horikawa H., Yamazaki S., Haikawa Y.,

RA Jin-no K., Takahashi M., Sekine M., Baba S.-I., Ankai A., Kosugi H.,

RA Hosoyama A., Fukui S., Nagai Y., Nishijima K., Nakazawa H.,

RA Takamiya M., Masuda S., Funahashi T., Tanaka T., Kudoh Y.,

RA Yamazaki J., Kushida N., Oguchi A., Aoki K.-I., Kubota K.,

RA Nakamura Y., Nomura N., Sako Y., Kikuchi H.;

RT "Complete genome sequence of an aerobic hyper-thermophilic

RT crenarchaeon, Aeropyrum pernix K1.";

RL DNA Res. 6:83-101(1999). 

CC -!- FUNCTION: Probably involved in degradation of a variety of RNA 
CC species; could act a RNA-binding component of the exosome 

CC (Potential).

CC -!- SUBUNIT: Component of the archaeal exosome multienzyme

CC ribonuclease complex (Potential).

CC -!- SUBCELLULAR LOCATION: Cytoplasmic (Potential).

CC -!- SIMILARITY: Contains 1 KH domain.

CC -!- SIMILARITY: Contains 1 S1 motif domain.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AP000061; BAA80446.1; -. 

DR PIR; H72623; H72623. 

DR HAMAP; MF_00623; -; 1. 

DR InterPro; IPR004087; KH_dom.

DR InterPro; IPR003029; S1.

DR Pfam; PF00013; KH; 1.

DR SMART; SM00322; KH; 1.

DR SMART; SM00316; S1; 1.

DR PROSITE; PS50084; KH_TYPE_1; FALSE_NEG.

DR PROSITE; PS50126; S1; 1.

KW Exosome; RNA-binding; Complete proteome.

FT DOMAIN 67 139 S1 MOTIF.



FT DOMAIN 147 206 KH. 

SQ SEQUENCE 235 AA; 26060 MW; 70A79A5EB0BF8CE7 CRC64; 



Query Match 63.0%; Score 34; DB 1; Length 235;

Best Local Similarity 62.5%; Pred. No. 18;

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          2 NSRLHLRC 9
              | |:|| |
Db        186 NGRIHLEC 193



RESULT 13 
YQHQ_BACSU 

ID YQHQ_BACSU STANDARD; PRT; 318 AA. 

AC P54515; 

DT 01-OCT-1996 (Rel. 34, Created)

DT 01-OCT-1996 (Rel. 34, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update)

DE Hypothetical protein yqhQ.

GN YQHQ.

OS Bacillus subtilis.

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus.

OX NCBI_TaxID=1423;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168 / JH642; 

RX MEDLINE=97124195; PubMed=8969508;

RA Mizuno M., Masuda S., Takemaru K.-I., Hosono S., Sato T., Takeuchi M.,

RA Kobayashi Y.;

RT "Systematic sequencing of the 283 kb 210 degrees-232 degrees region of

RT the Bacillus subtilis genome containing the skin element and many

RT sporulation genes.";

RL Microbiology 142:3103-3111(1996). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168; 

RX MEDLINE=98044033; PubMed=9384377;

RA Kunst F., Ogasawara N., Moszer I., Albertini A.M., Alloni G.,

RA Azevedo V., Bertero M.G., Bessieres P., Bolotin A., Borchert S.,

RA Borriss R., Boursier L., Brans A., Braun M., Brignell S.C., Bron S.,

RA Brouillet S., Bruschi C.V., Caldwell B., Capuano V., Carter N.M.,

RA Choi S.K., Codani J.J., Connerton I.F., Cummings N.J., Daniel R.A.,

RA Denizot F., Devine K.M., Dusterhoft A., Ehrlich S.D., Emmerson P.T.,

RA Entian K.D., Errington J., Fabret C., Ferrari E., Foulger D.,

RA Fritz C., Fujita M., Fujita Y., Fuma S., Galizzi A., Galleron N.,

RA Ghim S.Y., Glaser P., Goffeau A., Golightly E.J., Grandi G.,

RA Guiseppi G., Guy B.J., Haga K., Haiech J., Harwood C.R., Henaut A.,

RA Hilbert H., Holsappel S., Hosono S., Hullo M.F., Itaya M., Jones L.,

RA Joris B., Karamata D., Kasahara Y., Klaerr-Blanchard M., Klein C.,

RA Kobayashi Y., Koetter P., Koningstein G., Krogh S., Kumano M.,

RA Kurita K., Lapidus A., Lardinois S., Lauber J., Lazarevic V.,

RA Lee S.M., Levine A., Liu H., Masuda S., Mauel C., Medigue C.,

RA Medina N., Mellado R.P., Mizuno M., Moestl D., Nakai S., Noback M.,

RA Noone D., O'Reilly M., Ogawa K., Ogiwara A., Oudega B., Park S.H.,

RA Parro V., Pohl T.M., Portetelle D., Porwollik S., Prescott A.M.,

RA Presecan E., Pujic P., Purnelle B., Rapoport G., Rey M., Reynolds S.,



RA Rieger M., Rivolta C., Rocha E., Roche B., Rose M., Sadaie Y.,
RA Sato T., Scanlan E., Schleich S., Schroeter R., Scoffone F.,
RA Sekiguchi J., Sekowska A., Seror S.J., Serror P., Shin B.S., Soldo B.,
RA Sorokin A., Tacconi E., Takagi T., Takahashi H., Takemaru K.,
RA Takeuchi M., Tamakoshi A., Tanaka T., Terpstra P., Tognoni A.,
RA Tosato V., Uchiyama S., Vandenbol M., Vannier F., Vassarotti A.,
RA Viari A., Wambutt R., Wedler E., Wedler H., Weitzenegger T.,
RA Winters P., Wipat A., Yamamoto H., Yamane K., Yasumoto K., Yata K.,
RA Yoshida K., Yoshikawa H.F., Zumstein E., Yoshikawa H., Danchin A.;
RT "The complete genome sequence of the Gram-positive bacterium Bacillus
RT subtilis.";
RL Nature 390:249-256(1997).
CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; D84432; BAA12554.1; -.
DR EMBL; Z99116; CAB14380.1; -.
DR PIR; H69959; H69959.
DR SubtiList; BG11705; yqhQ.
KW Hypothetical protein; Transmembrane; Complete proteome.
FT TRANSMEM 112 132 POTENTIAL.
FT TRANSMEM 147 167 POTENTIAL.
FT TRANSMEM 209 229 POTENTIAL.
FT TRANSMEM 237 257 POTENTIAL.
SQ SEQUENCE 318 AA; 36001 MW; 15619B40274BB716 CRC64;



Query Match 63.0%; Score 34; DB 1; Length 318;

Best Local Similarity 85.7%; Pred. No. 25;

Matches 6; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          3 SRLHLRC 9
              |||| ||
Db        203 SRLHYRC 209



RESULT 14 
AS14_MOUSE 

ID AS14_MOUSE STANDARD; PRT; 433 AA.

AC Q8VHS7; 

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 15-SEP-2003 (Rel. 42, Last annotation update) 

DE Ankyrin repeat and SOCS box containing protein 14 (ASB-14).

GN ASB14.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1] 

RP SEQUENCE FROM N.A. 



RA Kile B.T., Nicola N.A.; 

RT "SOCS box proteins."; 

RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.

CC -!- SIMILARITY: Contains 9 ANK repeats.

CC -!- SIMILARITY: Contains 1 SOCS box domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF403042; AAL57361.1; -.

DR MGD; MGI:2655107; Asb14.

DR InterPro; IPR002110; ANK.

DR InterPro; IPR001496; SOCS.

DR Pfam; PF00023; ank; 8. 

DR SMART; SM00248; ANK; 8. 



DR PROSITE; PS50088; ANK_REPEAT; 6.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DR PROSITE; PS50225; SOCS; 1.
KW ANK repeat; Repeat.
FT REPEAT 1 14 ANK 1.
FT REPEAT 18 47 ANK 2.
FT REPEAT 51 80 ANK 3.
FT REPEAT 94 123 ANK 4.
FT REPEAT 127 156 ANK 5.
FT REPEAT 159 188 ANK 6.
FT REPEAT 201 230 ANK 7.
FT REPEAT 231 260 ANK 8.
FT REPEAT 262 295 ANK 9.
FT DOMAIN 367 422 SOCS BOX.
SQ SEQUENCE 433 AA; 48317 MW; 6BCAD1AC2B2BB0

Query Match 63.0%; Score 34; DB 1; Length 433;

Best Local Similarity 66.7%; Pred. No. 35;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              |  || |||
Db        391 CMGRLRLRC 399



RESULT 15 
NO70_SOYBN 

ID NO70_SOYBN STANDARD; PRT; 485 AA.

AC Q02920;

DT 01-FEB-1995 (Rel. 31, Created)

DT 01-FEB-1995 (Rel. 31, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Early nodulin 70. 

OS Glycine max (Soybean).

OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;
OC eurosids I; Fabales; Fabaceae; Papilionoideae; Phaseoleae; Glycine.

OX NCBI_TaxID=3847;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Akisengoku; 

RX MEDLINE=93241143; PubMed=7683079;

RA Kouchi H., Hata S.;

RT "Isolation and characterization of novel nodulin cDNAs representing 

RT genes expressed at early stages of soybean nodule development."; 

RL Mol. Gen. Genet. 238:106-119(1993). 

RN [2] 

RP SIMILARITY TO SULFATE PERMEASES. 

RX MEDLINE=94188926; PubMed=8140616;

RA Sandal N.N., Marcker K.A.;

RT "Similarities between a soybean nodulin, Neurospora crassa sulphate

RT permease II and a putative human tumour suppressor.";

RL Trends Biochem. Sci. 19:19-19(1994).

CC -!- FUNCTION: POSSIBLE SULFATE TRANSPORTER. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential). 

CC -!- DEVELOPMENTAL STAGE: EXPRESSED AT EARLY STAGES OF NODULE 
CC DEVELOPMENT.

CC -!- SIMILARITY: BELONGS TO THE SLC26A FAMILY OF TRANSPORTERS.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D13505; BAA02723.1; -. 

DR PIR; S34800; S34800. 

DR InterPro; IPR001902; Sulph_transpt.

DR Pfam; PF00916; Sulfate_transp; 1.

DR TIGRFAMs; TIGR00815; SulP; 1.

DR PROSITE; PS01130; SLC26A; 1.

KW Nodulation; Transmembrane; Transport. 
FT TRANSMEM 74 94 POTENTIAL.
FT TRANSMEM 97 117 POTENTIAL.
FT TRANSMEM 121 141 POTENTIAL.
FT TRANSMEM 152 172 POTENTIAL.
FT TRANSMEM 186 206 POTENTIAL.
FT TRANSMEM 232 252 POTENTIAL.
FT TRANSMEM 262 282 POTENTIAL.
FT TRANSMEM 321 341 POTENTIAL.
FT TRANSMEM 361 381 POTENTIAL.
FT TRANSMEM 400 420 POTENTIAL.
FT TRANSMEM 421 441 POTENTIAL.
FT TRANSMEM 455 475 POTENTIAL.
SQ SEQUENCE 485 AA; 52945 MW; 3738B6F64

Query Match 63.0%; Score 34; DB 1; Length 485;

Best Local Similarity 75.0%; Pred. No. 4 10;

Matches 6; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CNSRLHLR 8

Db        477 CRSRYHLR 484



Search completed: November 13, 2003, 09:46:29 
Job time : 6.15625 secs

GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: November 13, 2003, 09:31:40 ; Search time 23.7188 Seconds
        (without alignments)
        97.917 Million cell updates/sec

Title: US-09-228-866-1
Perfect score: 54
Sequence: 1 CNSRLHLRC 9

Scoring table: BLOSUM62
               Gapop 10.0 , Gapext 0.5

Searched: 830525 seqs, 258052604 residues

Total number of hits satisfying chosen parameters: 830525

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database : SPTREMBL_23:*
  1: sp_archea:*
  2: sp_bacteria:*
  3: sp_fungi:*
  4: sp_human:*
  5: sp_invertebrate:*
  6: sp_mammal:*
  7: sp_mhc:*
  8: sp_organelle:*
  9: sp_phage:*
 10: sp_plant:*
 11: sp_rodent:*
 12: sp_virus:*
 13: sp_vertebrate:*
 14: sp_unclassified:*
 15: sp_rvirus:*
 16: sp_bacteriap:*
 17: sp_archeap:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
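Two of the header figures for this run can be cross-checked with simple arithmetic. The sketch below is an illustration only, not GenCore's own code. It assumes that the perfect score is the query aligned against itself under BLOSUM62 (only the diagonal entries needed for the two queries in this report are hard-coded), and that one "cell update" is one query residue compared with one database residue. The function names are hypothetical.

```python
# Diagonal (self-match) BLOSUM62 scores, restricted to the residues that
# occur in the two query peptides of this report.
BLOSUM62_DIAGONAL = {
    "C": 9, "D": 6, "E": 5, "G": 6, "H": 8, "L": 4,
    "N": 6, "R": 5, "S": 4, "V": 4, "W": 11,
}


def perfect_score(query: str) -> int:
    """Score of the query aligned to itself (assumed meaning of 'Perfect score')."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def million_cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Assumed definition: query length x database residues / search time."""
    return query_len * db_residues / seconds / 1e6


print(perfect_score("CNSRLHLRC"))                                        # 54
print(f"{million_cell_updates_per_sec(9, 258052604, 23.7188):.3f}")      # 97.917
```

Both values agree with the header (Perfect score 54; 97.917 Million cell updates/sec), which supports the assumed definitions without proving them.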



SUMMARIES

                          %
Result                  Query
   No.   Score   Match  Length  DB  ID       Description
     1      44    81.5     142  10  Q94DJ7   Q94dj7 oryza sativ
     2      40    74.1    1327   4  O60859   O60859 homo sapien
     3      40    74.1    1327  11  Q9R114   Q9r114 mus musculu
     4      40    74.1    1332   4  Q8IY17   Q8iy17 homo sapien
     5      39    72.2    1348   5  Q8I2K9   Q8i2k9 Plasmodium
     6      38    70.4     220  10  Q9LS99   Q9ls99 arabidopsis
     7      38    70.4     309  10  Q8S3N1   Q8s3n1 arabidopsis
     8      38    70.4     315  10  Q9XGC2   Q9xgc2 vitis vinif
     9      38    70.4     451   2  Q9LAB9   Q9lab9 pseudomonas
    10      38    70.4     514  16  Q9PBA3   Q9pba3 xylella fas
    11      38    70.4     669   5  Q8ISE4   Q8ise4 drosophila
    12      38    70.4     669   5  Q8ISD4   Q8isd4 drosophila
    13      38    70.4     675   5  Q8ISE3   Q8ise3 drosophila
    14      38    70.4     675   5  Q8ISE2   Q8ise2 drosophila
    15      38    70.4     675   5  Q8ISE1   Q8ise1 drosophila
    16      38    70.4     675   5  Q8I0D9   Q8i0d9 drosophila
    17      38    70.4     675   5  Q8I086   Q8i086 drosophila
    18      38    70.4     739   5  Q9W4Z3   Q9w4z3 drosophila
    19      38    70.4     782   5  Q9NEH9   Q9neh9 drosophila
    20      37    68.5     464  10  Q9SUA5   Q9sua5 arabidopsis
    21      37    68.5     574  10  Q9M8Z9   Q9m8z9 arabidopsis
    22      37    68.5     736   6  Q28482   Q28482 macaca fasc
    23      37    68.5    1430   5  Q23541   Q23541 caenorhabdi
    24      36    66.7     146  10  O64874   O64874 arabidopsis
    25      36    66.7     199   4  Q8NHT9   Q8nht9 homo sapien
    26      36    66.7     259   3  Q9P3T8   Q9p3t8 schizosacch
    27      36    66.7     332  13  Q98U07   Q98u07 pseudotylos
    28      36    66.7     332  13  Q98U08   Q98u08 platybelone
    29      36    66.7     333  13  Q9DF04   Q9df04 strongylura
    30      36    66.7     333  13  Q9DF15   Q9df15 platybelone
    31      36    66.7     333  13  Q9DF08   Q9df08 strongylura
    32      36    66.7     333  13  Q9DF10   Q9df10 potamorrhap
    33      36    66.7     333  13  Q9DF14   Q9df14 potamorrhap
    34      36    66.7     333  13  Q9DF01   Q9df01 belonion ap
    35      36    66.7     333  13  Q9DD82   Q9dd82 potamorrhap
    36      36    66.7     333  13  Q9DD51   Q9dd51 pseudotylos
    37      36    66.7     333  13  Q9DD50   Q9dd50 belonion di
    38      36    66.7     333  13  Q9DF03   Q9df03 strongylura
    39      36    66.7     333  13  Q9DF16   Q9df16 strongylura
    40      36    66.7     333  13  Q9DF12   Q9df12 strongylura
    41      36    66.7     333  13  Q9DD64   Q9dd64 strongylura
    42      36    66.7     333  13  Q9DD35   Q9dd35 strongylura
    43      36    66.7     333  13  Q9DF13   Q9df13 potamorrhap
    44      36    66.7     333  13  Q9DF05   Q9df05 strongylura
    45      36    66.7     333  13  Q9DF02   Q9df02 strongylura



ALIGNMENTS 



RESULT 1 
Q94DJ7 



ID Q94DJ7 PRELIMINARY; PRT; 142 AA. 

AC Q94DJ7; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update) 

DE P0514H03.16 protein. 

GN P0514H03.16. 

OS Oryza sativa (Rice). 

OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 

OC Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; 

OC Ehrhartoideae; Oryzeae; Oryza. 

OX NCBI_TaxID=4530; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Nipponbare; 

RA Sasaki T., Matsumoto T., Yamamoto K.; 

RT "Oryza sativa nipponbare (GA3) genomic DNA, chromosome 1, PAC 

RT clone:P0514H03."; 

RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AP003275; BAB63652.1; 

DR Gramene; Q94DJ7; -. 

SQ SEQUENCE 142 AA; 14769 MW; 507041848FEEA897 CRC64; 

Query Match 81.5%; Score 44; DB 10; Length 142; 

Best Local Similarity 77.8%; Pred. No. 0.32; 

Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy          1 CNSRLHLRC 9 
              |  ||||||
Db         50 CQGRLHLRC 58 
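The statistics printed with each alignment can be reproduced from the two aligned strings. The following sketch is a hedged illustration rather than the search program's implementation. It hard-codes only the BLOSUM62 entries needed for RESULT 1 and RESULT 2, and it assumes that "Conservative" counts non-identical pairs with a positive matrix score, that "Query Match" is the score as a percentage of the perfect score (54 for this query), and that "Best Local Similarity" is identities divided by alignment length. The helper names are hypothetical.

```python
# BLOSUM62 entries for the residue pairs seen in RESULT 1 and RESULT 2 only.
BLOSUM62_SUBSET = {
    ("C", "C"): 9, ("R", "R"): 5, ("L", "L"): 4, ("H", "H"): 8,
    ("N", "Q"): 0, ("S", "G"): 0, ("N", "S"): 1, ("R", "H"): 0,
}


def pair_score(a: str, b: str) -> int:
    """Look up a symmetric substitution score; raises KeyError if not in the subset."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def summarize(query: str, db: str, perfect: int) -> dict:
    """Summarize an ungapped alignment of two equal-length strings."""
    pairs = list(zip(query, db))
    score = sum(pair_score(a, b) for a, b in pairs)
    matches = sum(a == b for a, b in pairs)
    conservative = sum(a != b and pair_score(a, b) > 0 for a, b in pairs)
    return {
        "score": score,
        "matches": matches,
        "conservative": conservative,
        "mismatches": len(pairs) - matches - conservative,
        "query_match": round(100 * score / perfect, 1),
        "best_local_similarity": round(100 * matches / len(pairs), 1),
    }


print(summarize("CNSRLHLRC", "CQGRLHLRC", perfect=54))  # RESULT 1
print(summarize("CNSRLHLRC", "CSGHLHLRC", perfect=54))  # RESULT 2
```

Under these assumptions the first call gives score 44, 7 matches, 81.5% and 77.8%, and the second gives score 40, 6 matches, 1 conservative pair, 74.1% and 66.7%, in line with the figures reported for the two hits.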

RESULT 2 
O60859 

ID O60859 PRELIMINARY; PRT; 1327 AA. 

AC O60859; 

DT 01-AUG-1998 (TrEMBLrel. 07, Created) 

DT 01-AUG-1998 (TrEMBLrel. 07, Last sequence update) 

DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update) 

DE Neuropathy target esterase. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RX MEDLINE=98244804; PubMed=9576844; 

RA Lush M.J., Li Y., Read D.J., Willis A.C., Glynn P.; 

RT "Neuropathy Target Esterase (NTE) and a homologous Drosophila 

RT neurodegeneration-associated mutant protein contain a novel domain 

RT conserved from bacteria to man."; 

RL Biochem. J. 332:1-4(1998). 

DR EMBL; AJ004832; CAA06164.1; 

DR InterPro; IPR000595; cNMP_binding. 

DR InterPro; IPR002641; Patatin. 

DR InterPro; IPR001423; UPF0028. 



DR Pfam; PF00027; cNMP_binding; 3. 

DR Pfam; PF01734; Patatin; 1. 

DR SMART; SM00100; CNMP; 2. 

DR PROSITE; PS50042; CNMP_BINDING_3; 3. 

DR PROSITE; PS01237; UPF0028; 1. 

SQ SEQUENCE 1327 AA; 146215 MW; E823248C9B29DD84 CRC64; 

Query Match 74.1%; Score 40; DB 4; Length 1327; 

Best Local Similarity 66.7%; Pred. No. 16; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy          1 CNSRLHLRC 9 
              |:  |||||
Db        880 CSGHLHLRC 888 



RESULT 3 
Q9R114 



ID Q9R114 PRELIMINARY; PRT; 1327 AA.
AC Q9R114;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Neuropathy target esterase homolog.
GN NTE.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Balb/c;
RA Kretzschmar D., Stempfl T., Moser M.
RT "Cloning of murine sws/NTE.";
RL Submitted (JUL-1999) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF173829; AAD51700.1;
DR MGD; MGI:1354723; Nte.
DR InterPro; IPR000595; cNMP_binding.
DR InterPro; IPR002641; Patatin.
DR InterPro; IPR001423; UPF0028.
DR Pfam; PF00027; cNMP_binding; 3.
DR Pfam; PF01734; Patatin; 1.
DR SMART; SM00100; cNMP; 2.
DR PROSITE; PS50042; CNMP_BINDING_3; 3.
DR PROSITE; PS01237; UPF0028; 1.
SQ SEQUENCE 1327 AA; 146561 MW; 1824D4F6BA6BEC70 CRC64;

Query Match 74.1%; Score 40; DB 11; Length 1327;

Best Local Similarity 66.7%; Pred. No. 16;

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9

Db        880 CSGHLHLRC 888



RESULT 4 



Q8IY17 

ID Q8IY17 PRELIMINARY; PRT; 1332 AA. 

AC Q8IY17; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Neuropathy target esterase (Fragment). 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Testis; 

RA Strausberg R.; 

RL Submitted (SEP-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC038229; AAH38229.1; -. 

FT NON_TER 1 1 

SQ SEQUENCE 1332 AA; 146818 MW; 7B1C72CAB920B97A CRC64; 

Query Match 74.1%; Score 40; DB 4; Length 1332; 

Best Local Similarity 66.7%; Pred. No. 16; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy          1 CNSRLHLRC 9 
              |:  |||||
Db        885 CSGHLHLRC 893 

RESULT 5 
Q8I2K9 

ID Q8I2K9 PRELIMINARY; PRT; 1348 AA. 

AC Q8I2K9; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Diacylglycerol kinase, putative (EC 2.7.1.107). 

GN PFI1485C. 

OS Plasmodium falciparum (isolate 3D7). 

OC Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium. 

OX NCBI_TaxID=36329; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22255708; PubMed=12368867; 

RA Hall N., Pain A., Berriman M., Churcher C., Harris B., Harris D., 

RA Mungall K., Bowman S., Atkin R., Baker S., Barron A., Brooks K., 

RA Buckee C.O., Burrows C., Cherevach I., Chillingworth C., 

RA Chillingworth T., Christodoulou Z., Clark L., Clark R., Corton C., 

RA Cronin A., Davies R., Davis P., Dear P., Dearden F., Doggett J., 

RA Feltwell T., Goble A., Goodhead I., Gwilliam R., Hamlin N., Hance Z., 

RA Harper D., Hauser H., Hornsby T., Holroyd S., Horrocks P., 

RA Humphray S., Jagels K., James K.D., Johnson D., Kerhornou A., 

RA Knights A., Konfortov B., Kyes S., Larke N., Lawson D., Lennard N., 

RA Line A., Maddison M., Mclean J., Mooney P., Moule S., Murphy L., 

RA Oliver K., Ormond D., Price C., Quail M.A., Rabbinowitsch E., 

RA Rajandream M.A., Rutter S., Rutherford K.M., Sanders M., Simmonds M., 

RA Seeger K., Sharp S., Smith R., Squares R., Squares S., Stevens K., 

RA Taylor K., Tivey A., Unwin L., Whitehead S., Woodward J., 

RA Sulston J.E., Craig A., Newbold C., Barrell B.G.; 

RT "Sequence of Plasmodium falciparum chromosomes 1, 3-9 and 13."; 

RL Nature 419:527-531(2002). 

DR EMBL; AL929358; CAD51983.1; -. 

KW Kinase; Transferase. 

SQ SEQUENCE 1348 AA; 158972 MW; 7523D6F052DB18FD CRC64; 

Query Match 72.2%; Score 39; DB 5; Length 1348; 

Best Local Similarity 66.7%; Pred. No. 25; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 
Qy 1 CNSRLHLRC 9 

Db 222 CNKYFHLRC 230 



RESULT 6 
Q9LS99 

ID Q9LS99 PRELIMINARY; PRT; 220 AA. 

AC Q9LS99; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Contains similarity to RING zinc finger protein. 

OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Columbia;
RA Sato S., Nakamura Y., Kaneko T., Kato T., Asamizu E., Tabata S.;
RL Submitted (APR-1999) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=Columbia;
RX MEDLINE=20277480; PubMed=10819329;
RA Nakamura Y.;
RT "Structural analysis of Arabidopsis thaliana chromosome 3. I. Sequence
RT features of the regions of 4,504,864 bp covered by sixty P1 and TAC
RT clones.";
RL DNA Res. 7:131-135(2000).
RN [3]
RP SEQUENCE FROM N.A.
RA Haas B.J., Volfovsky N., Town C.D., Troukhan M., Alexandrov N.,
RA Feldmann K.A., Flavell R.B., White O., Salzberg S.L.;
RT "Full-length messenger RNA sequences greatly improve genome
RT annotation.";
RL Genome Biol. 0:0-0(2002).
RN [4]
RP SEQUENCE FROM N.A.
RA Brover V., Troukhan M., Alexandrov N., Lu Y.-P., Flavell R.,
RA Feldmann K.;
RT "Full-Length cDNA from Arabidopsis thaliana.";
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
CC -!- SIMILARITY: CONTAINS 1 RING-TYPE ZINC FINGER.
DR EMBL; AB026654; BAB01804.1; -.
DR EMBL; AY086917; AAM64481.1;
DR HSSP; P28990; 1CHC.
DR InterPro; IPR001841; Znf_ring.
DR Pfam; PF00097; zf-C3HC4; 1.
DR PROSITE; PS50089; ZF_RING_2; 1.
KW Metal-binding; Zinc; Zinc-finger.
SQ SEQUENCE 220 AA; 24463 MW; F63F08AEACA4494D CRC64;

Query Match 70.4%; Score 38; DB 10; Length 220;

Best Local Similarity 66.7%; Pred. No. 7.7;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              ||   ||||
Db        149 CNHGFHLRC 157



RESULT 7 
Q8S3N1 

ID Q8S3N1 PRELIMINARY; PRT; 309 AA. 

AC Q8S3N1; 

DT 01-JUN-2002 (TrEMBLrel. 21, Created)
DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Ring finger E3 ligase SINAT5.
GN SINAT5.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP SEQUENCE FROM N.A.
RA Xie Q., Chua N.H.;
RT "SINAT5, a RING E3 ubiquitin protein ligase, promotes post-
RT translational degradation of NAC1 to attenuate auxin signals.";
RL Submitted (FEB-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF480944; AAM11573.1;
DR InterPro; IPR004162; Sina.
DR InterPro; IPR001841; Znf_ring.
DR Pfam; PF03145; Sina; 1.
DR PROSITE; PS50089; ZF_RING_2; 1.
KW Ligase.
SQ SEQUENCE 309 AA; 35008 MW; 3908E2353BB57AAF CRC64;

Query Match 70.4%; Score 38; DB 10; Length 309;

Best Local Similarity 66.7%; Pred. No. 11;

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              | ||:| ||
Db         70 CKSRVHNRC 78



RESULT 8 



Q9XGC2 

ID Q9XGC2 PRELIMINARY; PRT; 315 AA. 

AC Q9XGC2; 

DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE SINA1p.
OS Vitis vinifera (Grape).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Vitaceae;
OC Vitis.
OX NCBI_TaxID=29760;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Optima;
RA Brehm I., Korfei M., Preisig-Mueller R., Kindl H.;
RT "A nuclear localised zinc finger protein found in a plant is
RT homologous to the Drosophila signal tranducing factor seven in
RT absentia.";
RL Submitted (NOV-1998) to the EMBL/GenBank/DDBJ databases.
DR EMBL; Y18471; CAB40577.1;
DR InterPro; IPR004162; Sina.
DR InterPro; IPR001841; Znf_ring.
DR Pfam; PF03145; Sina; 1.
DR PROSITE; PS50089; ZF_RING_2; 1.
SQ SEQUENCE 315 AA; 35838 MW; BC49A24384F6D028 CRC64;

Query Match 70.4%; Score 38; DB 10; Length 315;

Best Local Similarity 66.7%; Pred. No. 11;

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              | ||:| ||
Db         78 CKSRVHNRC 86

RESULT 9 
Q9LAB9 

ID Q9LAB9 PRELIMINARY; PRT; 451 AA. 

AC Q9LAB9; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created)
DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Polyurethanase lipase A.
GN PULA.
OS Pseudomonas fluorescens.
OC Bacteria; Proteobacteria; Gammaproteobacteria; Pseudomonadales;
OC Pseudomonadaceae; Pseudomonas.
OX NCBI_TaxID=294;
RN [1]
RP SEQUENCE FROM N.A.
RA Ruiz C., Vega R., Howard G.T.;
RL Submitted (APR-1999) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF144089; AAF66684.1;
DR InterPro; IPR001343; Hemlysn_Ca_bind.
DR InterPro; IPR000734; Lipase.
DR Pfam; PF00353; hemolysinCabind; 3.
DR PRINTS; PR00313; CABNDNGRPT.
DR PROSITE; PS00330; HEMOLYSIN_CALCIUM; 1.
DR PROSITE; PS00120; LIPASE_SER; 1.
SQ SEQUENCE 451 AA; 48187 MW; 1164AAE73BFD0CA3 CRC64;

Query Match 70.4%; Score 38; DB 2; Length 451;

Best Local Similarity 66.7%; Pred. No. 15;

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              |: ||| ||
Db        128 CDHRLHRRC 136



RESULT 10 
Q9PBA3 

ID Q9PBA3 PRELIMINARY; PRT; 514 AA.
AC Q9PBA3;
DT 01-OCT-2000 (TrEMBLrel. 15, Created)
DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Periplasmic protease.
GN XF2241.
OS Xylella fastidiosa.
OC Bacteria; Proteobacteria; Gammaproteobacteria; Xanthomonadales;
OC Xanthomonadaceae; Xylella.
OX NCBI_TaxID=2371;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=9a5c;
RX MEDLINE=20365717; PubMed=10910347;
RA Simpson A.J.G., Reinach F.C., Arruda P., Abreu F.A., Acencio M.,
RA Alvarenga R., Alves L.M.C., Araya J.E., Baia G.S., Baptista C.S.,
RA Barros M.H., Bonaccorsi E.D., Bordin S., Bove J.M., Briones M.R.S.,
RA Bueno M.R.P., Camargo A.A., Camargo L.E.A., Carraro D.M., Carrer H.,
RA Colauto N.B., Colombo C., Costa F.F., Costa M.C.R., Costa-Neto C.M.,
RA Coutinho L.L., Cristofani M., Dias-Neto E., Docena C., El-Dorry H.,
RA Facincani A.P., Ferreira A.J.S., Ferreira V.C.A., Ferro J.A.,
RA Fraga J.S., Franca S.C., Franco M.C., Frohme M., Furlan L.R.,
RA Garnier M., Goldman G.H., Goldman M.H.S., Gomes S.L., Gruber A.,
RA Ho P.L., Hoheisel J.D., Junqueira M.L., Kemper E.L., Kitajima J.P.,
RA Krieger J.E., Kuramae E.E., Laigret F., Lambais M.R., Leite L.C.C.,
RA Lemos E.G.M., Lemos M.V.F., Lopes S.A., Lopes C.R., Machado J.A.,
RA Machado M.A., Madeira A.M.B.N., Madeira H.M.F., Marino C.L.,
RA Marques M.V., Martins E.A.L., Martins E.M.F., Matsukuma A.Y.,
RA Menck C.F.M., Miracca E.C., Miyaki C.Y., Monteiro-Vitorello C.B.,
RA Moon D.H., Nagai M.A., Nascimento A.L.T.O., Netto L.E.S.,
RA Nhani A. Jr., Nobrega F.G., Nunes L.R., Oliveira M.A.,
RA de Oliveira M.C., de Oliveira R.C., Palmieri D.A., Paris A.,
RA Peixoto B.R., Pereira G.A.G., Pereira H.A. Jr., Pesquero J.B.,
RA Quaggio R.B., Roberto P.G., Rodrigues V., de Rosa A.J.M.,
RA de Rosa V.E. Jr., de Sa R.G., Santelli R.V., Sawasaki H.E.,
RA da Silva A.C.R., da Silva A.M., da Silva F.R., Silva W.A. Jr.,
RA da Silveira J.F., Silvestri M.L.Z., Siqueira W.J., de Souza A.A.,
RA de Souza A.P., Terenzi M.F., Truffi D., Tsai S.M., Tsuhako M.H.,
RA Vallada H., Van Sluys M.A., Verjovski-Almeida S., Vettore A.L.,
RA Zago M.A., Zatz M., Meidanis J., Setubal J.C.;
RT "The genome sequence of the plant pathogen Xylella fastidiosa.";
RL Nature 406:151-159(2000).
CC -!- SIMILARITY: TO SERINE PROTEASES, TRYPSIN FAMILY.
DR EMBL; AE004037; AAF85040.1; -.
DR InterPro; IPR001478; PDZ.
DR InterPro; IPR001940; Protease2C.
DR InterPro; IPR001254; Ser_protease_Try.
DR InterPro; IPR000126; Ser_proteas_V8.
DR Pfam; PF00595; PDZ; 1.
DR Pfam; PF00089; trypsin; 1.
DR PRINTS; PR00834; PROTEASES2C.
DR PRINTS; PR00839; V8PROTEASE.
DR SMART; SM00228; PDZ; 2.
DR PROSITE; PS50106; PDZ; 1.
KW Hydrolase; Serine protease; Complete proteome.
SQ SEQUENCE 514 AA; 54140 MW; 707C23FD3C82BE4C CRC64;

Query Match 70.4%; Score 38; DB 16; Length 514;

Best Local Similarity 75.0%; Pred. No. 17;

Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy          2 NSRLHLRC 9
              |||:| ||
Db          2 NSRIHTRC 9



RESULT 11 
Q8ISE4 
ID Q8ISE4 PRELIMINARY; PRT; 669 AA.
AC Q8ISE4;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Polehole (Fragment).
GN PH.
OS Drosophila mauritiana (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7226;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=MAU11;
RA Riley R.M., Jin W., Gibson G.;
RT "Contrasting selection pressures on components of the Ras-mediated
RT signal transduction pathway in Drosophila.";
RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY135030; AAN17540.1; -.
FT NON_TER 1 1
SQ SEQUENCE 669 AA; 75788 MW; 7A3E8729F9425927 CRC64;

Query Match 70.4%; Score 38; DB 5; Length 669;

Best Local Similarity 66.7%; Pred. No. 21;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              || | | ||
Db        182 CNFRFHQRC 190



RESULT 12 
Q8ISD4 



ID Q8ISD4 PRELIMINARY; PRT; 669 AA.
AC Q8ISD4;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Polehole (Fragment).
GN PH.
OS Drosophila simulans (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7240;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SIM31;
RA Riley R.M., Jin W., Gibson G.;
RT "Contrasting selection pressures on components of the Ras-mediated
RT signal transduction pathway in Drosophila.";
RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY135135; AAN17563.1; -.
FT NON_TER 1 1
SQ SEQUENCE 669 AA; 75772 MW; C6204C078263B03C CRC64;

Query Match 70.4%; Score 38; DB 5; Length 669;

Best Local Similarity 66.7%; Pred. No. 21;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              || | | ||
Db        182 CNFRFHQRC 190



RESULT 13 
Q8ISE3 

ID Q8ISE3 PRELIMINARY; PRT; 675 AA. 

AC Q8ISE3; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE Polehole (Fragment) . 

GN PH. 

OS Drosophila melanogaster (Fruit fly) . 

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=1; 

RA Riley R.M., Jin W. , Gibson G.; 

RT "Contrasting selection pressures on components of the Ras-mediated 

RT signal transduction pathway in Drosophila."; 



RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AY135031; AAN17541.1; 

FT NON_TER 1 1 

SQ SEQUENCE 675 AA; 76488 MW; 29449E17C54A6125 CRC64; 

Query Match 70.4%; Score 38; DB 5; Length 675; 

Best Local Similarity 66.7%; Pred. No. 21; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 CNSRLHLRC 9 
              || | | ||
Db        188 CNFRFHQRC 196 
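Because every hit repeats the same three statistics lines, they are easy to pull into a structured form. The sketch below is a hypothetical helper, not part of the search software: it assumes the fields appear as `Name value;` pairs with an optional `%` before the semicolon, as they do in the cleanly printed records of this report.

```python
import re

# One "Name value;" pair. The field names are those printed in this report.
STAT = re.compile(
    r"(Query Match|Best Local Similarity|Pred\. No\.|Score|DB|Length|"
    r"Matches|Conservative|Mismatches|Indels|Gaps)"
    r"\s+([0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)%?;"
)


def parse_stats(text: str) -> dict:
    """Return every recognised statistic in `text` as a float."""
    return {name: float(value) for name, value in STAT.findall(text)}


record = """Query Match 70.4%; Score 38; DB 5; Length 675;
Best Local Similarity 66.7%; Pred. No. 21;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;"""

stats = parse_stats(record)
print(stats["Score"], stats["Query Match"], stats["Pred. No."])
```

A field whose value was lost in scanning (for example a `Gaps` entry with no number after it) is simply absent from the returned dictionary, so callers should not assume that all eleven keys are present.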



RESULT 14
Q8ISE2

ID Q8ISE2 PRELIMINARY; PRT; 675 AA.
AC Q8ISE2;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Polehole (Fragment).
GN PH.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Reids2;
RA Riley R.M., Jin W., Gibson G.;
RT "Contrasting selection pressures on components of the Ras-mediated
RT signal transduction pathway in Drosophila.";
RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY135038; AAN17548.1;
FT NON_TER 1 1
SQ SEQUENCE 675 AA; 76541 MW; FE2ACB49901B383E CRC64;

Query Match 70.4%; Score 38; DB 5; Length 675;

Best Local Similarity 66.7%; Pred. No. 21;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              || | | ||
Db        188 CNFRFHQRC 196



RESULT 15 
Q8ISE1 

ID Q8ISE1 PRELIMINARY; PRT; 675 AA.
AC Q8ISE1;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Polehole (Fragment).
GN PH.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=5-17-88b#5;
RA Riley R.M., Jin W., Gibson G.;
RT "Contrasting selection pressures on components of the Ras-mediated
RT signal transduction pathway in Drosophila.";
RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY135052; AAN17562.1; -.
FT NON_TER 1 1
SQ SEQUENCE 675 AA; 76502 MW; 0B2A538ED51B876E CRC64;

Query Match 70.4%; Score 38; DB 5; Length 675;

Best Local Similarity 66.7%; Pred. No. 21;

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy          1 CNSRLHLRC 9
              || | | ||
Db        188 CNFRFHQRC 196



Search completed: November 13, 2003, 09:50:55 
Job time : 24.7188 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: November 13, 2003, 09:39:50 ; Search time 10.6875 Seconds
(without alignments)
35.630 Million cell updates/sec

Title: US-09-228-866-2
Perfect score: 67
Sequence: 1 CENWWGDVC 9

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 328717 seqs, 42310858 residues

Total number of hits satisfying chosen parameters: 328717

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : Issued_Patents_AA:*
1: /cgn2_6/ptodata/1/iaa/5A_COMB.pep:*
2: /cgn2_6/ptodata/1/iaa/5B_COMB.pep:*
3: /cgn2_6/ptodata/1/iaa/6A_COMB.pep:*
4: /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*
5: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*
6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
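The Query Match column of the summaries appears to be the hit score expressed as a percentage of the perfect score. The short sketch below tabulates that relationship for the score values legible in this run. It is an inference from the reported numbers (for example 41 and 61.2), not a documented formula, and the function name is hypothetical.

```python
PERFECT_SCORE = 67  # reported in the header for query CENWWGDVC


def query_match_percent(score: int, perfect: int = PERFECT_SCORE) -> str:
    """Score as a percentage of the perfect score, to one decimal place."""
    return f"{100 * score / perfect:.1f}"


for score in (67, 41, 40, 39, 38, 37):
    print(score, query_match_percent(score))
# 67 100.0
# 41 61.2
# 40 59.7
# 39 58.2
# 38 56.7
# 37 55.2
```

The same relationship holds for the earlier run with perfect score 54, where a score of 44 corresponds to 81.5%.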



SUMMARIES 

                          %
Result                  Query
   No.   Score   Match  Length  DB  ID                     Description
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-526 


-710-2 


Sequence 


2, Appli 
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100 
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US- 
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-862 


-855-2 


Sequence 


2, Appli 
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67 


100 
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-226 


-985-2 


Sequence 


2, Appli 
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67 


100 


0 
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09 


-227 


-906-2 


Sequence 


2, Appli 
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41 


61 


2 


206 
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us- 


09 


-071 


-035-272 


Sequence 


272, App 
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41 


61 


2 


207 
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us- 


09 


-071 


-035-270 


Sequence 


270, App 
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40 


59 


7 


266 
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us- 


09 


-252 


-991A-17646 


Sequence 


17646, A 
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40 


59 


7 


372 


4 


us- 


09 


-092 


-315-13 


Sequence 


13, Appl 


     9      40    59.7     425   4  US-09-092-315-6        Sequence 6, Appli
    10      40    59.7     425   4  US-09-733-524A-6       Sequence 6, Appli
    11      40    59.7     454   4  US-09-092-315-8        Sequence 8, Appli
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RQ 

3 -J 


7 




A 

4 


T TO A Q TOO tro/i7\ o 


Sequence 


8, Appli 


13 




R 9 


7 


A £ A 
4 O 4 


4 


TTO AG A O O H r t 

Ub-uy-uy2-3ib-i 


Sequence 


1, Appli 


14 




R9 

3 -7 


7 


4 £4 


A 
4 


T TC AO TOO C O /I 7\ T 

Ub-Uy- /33-b^4A-l 


Sequence 


1, Appli 


15 


4 o 


R Q 

3 z> 


7 


4 / O 


4 


Ub-(jy-0y2-3l5-5 


Sequence 


5, Appli 


16 


4 0 


RQ 

3 Zf 


7 


4 7 £ 
*± / D 


A 
4 


T TC AO TOO trn/l7\ cr 

Ub-Uy- /33-3^4A-b 


Sequence 


5 , Appl i 


17 


4 0 


R 9 
3 y 


7 


4 7 fl 


4 


T TC AO ADO T 1 r r 7 

ub-uy-uy^-3ib - / 


Sequence 


7, Appli 


18 


4 0 


RQ 

3 -7 
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4 7 Q 
*± / O 


4 


TTC AG *~7 O "D CO /I 7\ *~7 

Ub - Uy - /3 3 -b^34A- / 


Sequence 


1, Appli 


19 


40 


R 9 


7 


4ft £ 

TOO 


4 


TTC-AQ AQO ^ 1 C O 


Sequence 


2, Appli 


2 0 


4 0 


RQ 
o y 


7 


4 fl £ 
4 O O 


A 

4 


T TC AO TOO CTO/i7\ O 

Ub-Uy- /33-D^4A-2 


Sequence 


2, Appli 


21 


4 n 


R Q 


7 


Dju 


4 


T TO AQ TOO /ICOA yi n ^ 

ub- uy - iy y -4 52A-4 8 y 


Sequence 


489, App 


22 


4 D 


R Q 


7 


1 Q Q 

lyy 


0 
3 


TTC AC1 TOA /~ 

Ub-uy-lo(J-4jy-b 


Sequence 


6, Appli 


23 


4 0 


R Q 
3 y 


7 


Q £ fl 
-/Do 


-J 
3 


T TC AQ TOA /IIO ") 

Ub-Uy-loU-43y-3 


Sequence 


3, Appli 


24 


4 P. 


R Q 
3 y 


7 


_7 D O 


O 
3 


TTO AQ 10A <n n yi 

ub-uy-io(j-43y-4 


Sequence 


4, Appli 


25 


4 D 


R Q 
3 y 


7 


iUlu 


o 
3 


TTO AQ "IOA /i t a n 

ub-uy-ioU-43y-o 


Sequence 


8, Appli 


26 


4 0 


R Q 

3 _? 


7 




o 

3 


T TO AQ OCTO CZ O C O 

Ub -Uy-333-b85-2 


Sequence 


2, Appli 


27 


4 n 


RQ 

3 


7 


_L _L _L Zj 


O 
3 


T TO AQ OCO COC O 

ub-uy-333-bob-3 


Sequence 


3, Appli 


28 


3 9 


Rft 


o 
z< 


DUO 


-3 
3 


TTC AQ A HO 0/1A7\ 

Ub-Uo-4 / A - Z4 UA - 1 b 


Sequence 


16, Appl 


29 


39 


58 


2 


ft ? ? 


A 


T TC - DQ-^nC c ci Ti 

ub-uy-zub-obl-zl 


Sequence 


21, Appl 


30 


^ 9 

3 .7 


Rft 

3 O 




O 33 


4 


TTO AQ OA£T CTCTI lO 

Ub-Uy-z0b-b3l-l3 


Sequence 


13, Appl 


31 


^ 9 

3 


Rfl 

3 o 


9 


0 7 9 


3 


TTO C\1 O C C >100 TO 

Ub-U / -y3b-4oj-lz 


Sequence 


12, Appl 


32 


J o 


R£ 

3 D 


7 


fl £ 


O 
3 


T TC AQ OQyi "IT! CTO 

ub-uo-ay4-i / j- 3^ 


Sequence 


52, Appl 


33 


^fl. 


R£ 

3 O 


7 


ft £ 
O D 


O 
3 


TTC Afl QQ/1 110 CO 

Ub-Uo-oy4-l / 3- b3 


Sequence 


53 , Appl 


34 


38 


R6 

3 O « 


7 


fl £ 


O 
3 


TTC AQ OQQ 1Q0 CO 

ub-uy-jyo-iyj-bz 


Sequence 


52, Appl 


7 R 

J 3 


7 fl 
O 0 


C £" 


7 

t 


o b 


3 


TTO AO OOO "ir»l ^"i 

US- 09-398 -193 -53 


Sequence 


53, Appl 




^ fl 

3 o 


R£ 
3 D . 


7 


9 0 Q 

<£3 y 


A 

4 


T TO AQ OCO OA1T\ onn A;i 

Ub-Uy-^ibz-yylA-2 82 04 


Sequence 


28204, A 


-j / 


1 fl 


3D . 


7 


^ y l 


4 


US- 09-252 -99 1A-28294 


Sequence 


28294, A 


38 


? fl 

3 o 


R£ 
3 D . 


7 




4 


TTC AQ /IOA lorTi i 

Ub-Uy-4z(J- /odA-3 


Sequence 


3, Appli 


39 


38 


56 . 


7 


437 


4 


US - 09-99fi-94^-?RR 




O C C 7\ -r-*-r-^ 

3 3 3, App 


    40      38    56.7     534   4  US-09-199-637A-67      Sequence 67, Appl
    41      38    56.7     534   4  US-09-252-991A-26566   Sequence 26566, A
    42      38    56.7    1248   3  US-08-726-214-16       Sequence 16, Appl
    43      37    55.2     121   4  US-09-252-991A-18698   Sequence 18698, A
    44      37    55.2     180   3  US-09-187-331-5        Sequence 5, Appli
    45      37    55.2     180   4  US-09-470-946-5        Sequence 5, Appli



ALIGNMENTS 



RESULT 1 
US-08-526-710-2 

; Sequence 2, Application US/08526710 
; Patent No. 5622699 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell and Flores 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 






MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/526,710

FILING DATE: 11-SEP-1995

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 1779 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-526-710-2 



Query Match 100.0%; Score 67; DB 1; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05;

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
Qy 1 CENWWGDVC 9

|||||||||

Db 1 CENWWGDVC 9 



RESULT 2 
US-08-862-855-2 

; Sequence 2, Application US/08862855 
; Patent No. 6068829 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
; NUMBER OF SEQUENCES: 44 

CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 
; STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/862,855

FILING DATE: 






CLASSIFICATION: 424 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 

FILING DATE: 10-MAR-1997 
ATTORNEY/AGENT INFORMATION:

NAME: Campbell, Cathryn A.

REGISTRATION NUMBER: 31,815

REFERENCE/DOCKET NUMBER: P-LJ 2621
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-862-855-2 



Query Match 100.0%; Score 67; DB 3; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 CENWWGDVC 9 

|||||||||

Db 1 CENWWGDVC 9 



RESULT 3 
US-09-226-985-2 

; Sequence 2, Application US/09226985 
; Patent No. 6296832 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Molecules That Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/226,985

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 






APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 

FILING DATE: 10-MAR-1997 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 

FILING DATE: 23-MAY-1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 3423 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-09-226-985-2 



Query Match 100.0%; Score 67; DB 3; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 
Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CENWWGDVC 9

|||||||||

Db 1 CENWWGDVC 9



RESULT 4 
US-09-227-906-2 

; Sequence 2, Application US/09227906 
; Patent No. 6306365 

GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/227,906

FILING DATE: 






CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 
FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 
FILING DATE: 10-MAR-1997 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 
FILING DATE: 23-MAY-1997 
ATTORNEY/AGENT INFORMATION: 
NAME: Campbell, Cathryn A. 
REGISTRATION NUMBER: 31,815 
REFERENCE/DOCKET NUMBER: P-LJ 3424 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 9 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-09-227-906-2 

Query Match 100.0%; Score 67; DB 4; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05;

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
Qy 1 CENWWGDVC 9

|||||||||

Db 1 CENWWGDVC 9 



RESULT 5 

US-09-071-035-272 

; Sequence 272, Application US/09071035 
; Patent No. 6448043 
; GENERAL INFORMATION: 

APPLICANT: Gil H. Choi 

TITLE OF INVENTION: Enterococcus faecalis Polynucleotides and Polypeptides
NUMBER OF SEQUENCES: 496 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Human Genome Sciences, Inc. 

STREET: 9410 Key West Avenue 

CITY: Rockville 
; STATE: Maryland 

COUNTRY : USA 

ZIP: 20850 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 

COMPUTER: HP Vectra 486/33 

OPERATING SYSTEM: MSDOS version 6.2 

SOFTWARE: ASCII Text 
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/071,035

FILING DATE: 






CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 

FILING DATE: 
ATTORNEY/AGENT INFORMATION:

NAME: A. Anders Brookes

REGISTRATION NUMBER: 36,373

REFERENCE/DOCKET NUMBER: PB369P2
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (301) 309-8504 

TELEFAX: (301) 309-8512 
; INFORMATION FOR SEQ ID NO: 272: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 206 amino acids 

TYPE: amino acid 

STRANDEDNESS : single 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-09-071-035-272 



Query Match 61.2%; Score 41; DB 4; Length 206;

Best Local Similarity 83.3%; Pred. No. 32;

Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy 2 ENWWGD 7

:|||||

Db 164 KNWWGD 169 
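
The Score and Query Match figures in this report can be re-derived by hand from the scoring table named in the search header (BLOSUM62). The following is a minimal sketch, not GenCore's own code: the function names are hypothetical, and only the handful of BLOSUM62 entries needed for this query and this hit are included. It assumes that Query Match is the hit score divided by the perfect (self-match) score, which is consistent with the values printed here.

```python
# Partial BLOSUM62 table: only the entries needed for this example.
# (A full matrix would be required for arbitrary sequences.)
BLOSUM62_SUBSET = {
    ("C", "C"): 9, ("E", "E"): 5, ("N", "N"): 6, ("W", "W"): 11,
    ("G", "G"): 6, ("D", "D"): 6, ("V", "V"): 4, ("E", "K"): 1,
}


def pair_score(a, b):
    """Look up a substitution score, treating the table as symmetric."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def ungapped_score(query, subject):
    """Sum substitution scores over an ungapped, equal-length alignment."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment needs equal-length segments")
    return sum(pair_score(q, s) for q, s in zip(query, subject))


def query_match_percent(score, perfect_score):
    """Assumed definition: hit score as a percentage of the perfect score."""
    return round(100.0 * score / perfect_score, 1)


perfect = ungapped_score("CENWWGDVC", "CENWWGDVC")  # query against itself
hit = ungapped_score("ENWWGD", "KNWWGD")            # aligned segment of this hit
print(perfect, hit, query_match_percent(hit, perfect))
```

With these entries the sketch reproduces the perfect score of 67, the hit score of 41, and the Query Match of 61.2% listed for this result.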



RESULT 6 

US-09-071-035-270 

; Sequence 270, Application US/09071035 
; Patent No. 6448043 
; GENERAL INFORMATION: 

APPLICANT: Gil H. Choi 

TITLE OF INVENTION: Enterococcus faecalis Polynucleotides and Polypeptides
NUMBER OF SEQUENCES: 496
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Human Genome Sciences, Inc. 

STREET: 9410 Key West Avenue 

CITY: Rockville 

STATE: Maryland 

COUNTRY : USA 

ZIP : 20850 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 

COMPUTER: HP Vectra 486/33 

OPERATING SYSTEM: MSDOS version 6.2 

SOFTWARE: ASCII Text 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/071,035

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 

FILING DATE: 
ATTORNEY/AGENT INFORMATION: 






; NAME: A. Anders Brookes 

REGISTRATION NUMBER: 36,373 
REFERENCE/DOCKET NUMBER: PB369P2 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (301) 309-8504 
TELEFAX: (301) 309-8512 
; INFORMATION FOR SEQ ID NO: 270: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 207 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-09-071-035-270 

Query Match 61.2%; Score 41; DB 4; Length 207; 

Best Local Similarity 83.3%; Pred. No. 32; 

Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy 2 ENWWGD 7

:|||||

Db 165 KNWWGD 170 



RESULT 7 

US-09-252-991A-17646 

; Sequence 17646, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136

; CURRENT APPLICATION NUMBER: US/09/252,991A

; CURRENT FILING DATE: 1999-02-18 

; PRIOR APPLICATION NUMBER: US 60/074,788 

; PRIOR FILING DATE: 1998-02-18 

; PRIOR APPLICATION NUMBER: US 60/094,190 

; PRIOR FILING DATE: 1998-07-27 

; NUMBER OF SEQ ID NOS : 33142 

; SEQ ID NO 17646 

LENGTH: 266 

TYPE: PRT 
; ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-17646 

Query Match 59.7%; Score 40; DB 4; Length 266; 

Best Local Similarity 44.4%; Pred. No. 58; 

Matches 4; Conservative 1; Mismatches 4; Indels 0; Gaps 0;
Qy 1 CENWWGDVC 9

|: ||   |

Db 55 CQAWWSQAC 63 
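
The Matches, Conservative and Mismatches counts and the Best Local Similarity percentage can be checked directly from the Qy and Db lines. The sketch below is an illustration under stated assumptions, not GenCore's implementation: it assumes that a "conservative" position is a non-identical pair with a positive BLOSUM62 score, and that Best Local Similarity is identical positions divided by alignment length. Both assumptions agree with the figures printed in this report. The function names are hypothetical, and only the BLOSUM62 entries needed for this alignment are included.

```python
# BLOSUM62 scores for the non-identical pairs in this alignment only.
BLOSUM62_SUBSET = {
    ("E", "Q"): 2, ("N", "A"): -2, ("G", "S"): 0, ("D", "Q"): 0, ("V", "A"): 0,
}


def subst_score(a, b):
    """Symmetric lookup of a substitution score for a non-identical pair."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def classify_alignment(qy, db):
    """Return (matches, conservative, mismatches, match_line) for an
    ungapped alignment of two equal-length segments."""
    matches = conservative = mismatches = 0
    marks = []
    for q, d in zip(qy, db):
        if q == d:
            matches += 1
            marks.append("|")          # identical residues
        elif subst_score(q, d) > 0:
            conservative += 1
            marks.append(":")          # assumed: positive score = conservative
        else:
            mismatches += 1
            marks.append(" ")
    return matches, conservative, mismatches, "".join(marks)


def best_local_similarity(matches, length):
    """Assumed definition: percentage of identical positions."""
    return round(100.0 * matches / length, 1)


m, c, x, line = classify_alignment("CENWWGDVC", "CQAWWSQAC")
print(m, c, x, repr(line), best_local_similarity(m, 9))
```

For this hit the sketch gives 4 matches, 1 conservative substitution (E/Q) and 4 mismatches, with 44.4% similarity, in agreement with the statistics above.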



RESULT 8 



US-09-092-315-13 

; Sequence 13, Application US/09092315 

; Patent No. 6399337 

; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE

; FILE REFERENCE: 07254/049001 

; CURRENT APPLICATION NUMBER: US/09/092,315 

CURRENT FILING DATE: 1998-06-05 
; EARLIER APPLICATION NUMBER: US 60/048,857 
; EARLIER FILING DATE: 1997-06-06 
; NUMBER OF SEQ ID NOS : 22 

SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 13 
LENGTH: 372 
TYPE: PRT 

ORGANISM: Helicobacter pylori 
US-09-092-315-13 



Query Match 59.7%; Score 40; DB 4; Length 372; 

Best Local Similarity 100.0%; Pred. No. 81; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 3 NWWGD 7 

|||||

Db 32 NWWGD 36 



RESULT 9 
US-09-092-315-6 

; Sequence 6, Application US/09092315 

; Patent No. 6399337 

; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE

; FILE REFERENCE: 07254/049001

; CURRENT APPLICATION NUMBER: US/09/092,315

; CURRENT FILING DATE: 1998-06-05 

; EARLIER APPLICATION NUMBER: US 60/048,857 

; EARLIER FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS: 22 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 6 

LENGTH: 425 

TYPE : PRT 

ORGANISM: Helicobacter pylori 
US-09-092-315-6 

Query Match 59.7%; Score 40; DB 4; Length 425; 

Best Local Similarity 100.0%; Pred. No. 93; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 3 NWWGD 7 

|||||

Db 33 NWWGD 37 






RESULT 10 
US-09-733-524A-6 

; Sequence 6, Application US/09733524A 

; Patent No. 6534298 

; GENERAL INFORMATION: 

; APPLICANT : Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: NUCLEIC ACIDS ENCODING ALPHA-1,3 

; TITLE OF INVENTION: FUCOSYLTRANSFERASES AND EXPRESSION SYSTEMS FOR MAKING AND

; TITLE OF INVENTION: EXPRESSING THEM (amended)
; FILE REFERENCE: 07254-049002

; CURRENT APPLICATION NUMBER: US/09/733,524A

; CURRENT FILING DATE: 2000-12-07 

; PRIOR APPLICATION NUMBER: US 09/092,315 

; PRIOR FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: US 60/048,857 

; PRIOR FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS : 27 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 6 

LENGTH: 425 

TYPE : PRT 

ORGANISM: Helicobacter pylori 
US-09-733-524A-6 



Query Match 59.7%; Score 40; DB 4; Length 425;

Best Local Similarity 100.0%; Pred. No. 93;

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
Qy 3 NWWGD 7

|||||

Db 33 NWWGD 37 



RESULT 11 
US-09-092-315-8 

; Sequence 8, Application US/09092315 

; Patent No. 6399337 

; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE

; FILE REFERENCE: 07254/049001

; CURRENT APPLICATION NUMBER: US/09/092,315

; CURRENT FILING DATE: 1998-06-05 

; EARLIER APPLICATION NUMBER: US 60/048,857 

; EARLIER FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS: 22 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 8 

LENGTH: 454 

TYPE: PRT 

ORGANISM: Helicobacter pylori 
US-09-092-315-8 






Query Match 59.7%; Score 40; DB 4; Length 454; 

Best Local Similarity 100.0%; Pred. No. 99; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 3 NWWGD 7 

|||||

Db 32 NWWGD 36 



RESULT 12 
US-09-733-524A-8 

; Sequence 8, Application US/09733524A 

; Patent No. 6534298 

; GENERAL INFORMATION: 

; APPLICANT : Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: NUCLEIC ACIDS ENCODING ALPHA-1,3

; TITLE OF INVENTION: FUCOSYLTRANSFERASES AND EXPRESSION SYSTEMS FOR MAKING AND

; TITLE OF INVENTION: EXPRESSING THEM (amended)
; FILE REFERENCE: 07254-049002

; CURRENT APPLICATION NUMBER: US/09/733,524A

; CURRENT FILING DATE: 2000-12-07 

; PRIOR APPLICATION NUMBER: US 09/092,315 

; PRIOR FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: US 60/048,857 

PRIOR FILING DATE: 1997-06-06 

NUMBER OF SEQ ID NOS : 27 
; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 8 

LENGTH: 454 
TYPE : PRT 

ORGANISM: Helicobacter pylori 
US-09-733-524A-8 



Query Match 59.7%; Score 40; DB 4; Length 454; 

Best Local Similarity 100.0%; Pred. No. 99;

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
Qy 3 NWWGD 7

|||||

Db 32 NWWGD 36 



RESULT 13 
US-09-092-315-1 

; Sequence 1, Application US/09092315 

; Patent No. 6399337 

; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE

; FILE REFERENCE: 07254/049001 

; CURRENT APPLICATION NUMBER: US/09/092,315 

; CURRENT FILING DATE: 1998-06-05 

; EARLIER APPLICATION NUMBER: US 60/048,857 






; EARLIER FILING DATE: 1997-06-06 
; NUMBER OF SEQ ID NOS : 22 

SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 1

LENGTH: 464

TYPE: PRT 

ORGANISM: Helicobacter pylori 
US-09-092-315-1 

Query Match 59.7%; Score 40; DB 4; Length 464;

Best Local Similarity 100.0%; Pred. No. 1e+02;

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
Qy 3 NWWGD 7

|||||

Db 32 NWWGD 36



RESULT 14 
US-09-733-524A-1 

; Sequence 1, Application US/09733524A 
; Patent No. 6534298 
; GENERAL INFORMATION: 

APPLICANT: Taylor, Diane E. 
; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: NUCLEIC ACIDS ENCODING ALPHA-1,3

; TITLE OF INVENTION: FUCOSYLTRANSFERASES AND EXPRESSION SYSTEMS FOR MAKING AND

; TITLE OF INVENTION: EXPRESSING THEM (amended)
; FILE REFERENCE: 07254-049002

; CURRENT APPLICATION NUMBER: US/09/733,524A

; CURRENT FILING DATE: 2000-12-07 

; PRIOR APPLICATION NUMBER: US 09/092,315 

; PRIOR FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: US 60/048,857 

; PRIOR FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS: 27 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 1 

LENGTH: 464

TYPE : PRT 

ORGANISM: Helicobacter pylori 
US-09-733-524A-1 

Query Match 59.7%; Score 40; DB 4; Length 464; 

Best Local Similarity 100.0%; Pred. No. 1e+02;

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 3 NWWGD 7

|||||

Db 32 NWWGD 36 



RESULT 15 
US-09-092-315-5 

; Sequence 5, Application US/09092315 
; Patent No. 6399337 






; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE

; FILE REFERENCE: 07254/049001

; CURRENT APPLICATION NUMBER: US/09/092,315

; CURRENT FILING DATE: 1998-06-05 

; EARLIER APPLICATION NUMBER: US 60/048,857 

; EARLIER FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS : 22 

SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 5 

LENGTH: 476 

TYPE : PRT 

ORGANISM: Helicobacter pylori 
US-09-092-315-5 

Query Match 59.7%; Score 40; DB 4; Length 476; 

Best Local Similarity 100.0%; Pred. No. 1e+02;

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 3 NWWGD 7

|||||

Db 33 NWWGD 37 



Search completed: November 13, 2003, 09:54:55 
Job time : 10.6875 secs

GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.



OM protein - protein search, using sw model

Run on:          November 13, 2003, 09:31:40 ; Search time 30.2812 Seconds
                 (without alignments)
                 47.176 Million cell updates/sec

Title:           US-09-228-866-2
Perfect score:   67
Sequence:        1 CENWWGDVC 9

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        1107863 seqs, 158726573 residues

Total number of hits satisfying chosen parameters: 1107863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :       A_Geneseq_19Jun03:*
                 1:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1980.DAT:*
                 2:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1981.DAT:*
                 3:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1982.DAT:*
                 4:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1983.DAT:*
                 5:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1984.DAT:*
                 6:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1985.DAT:*
                 7:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1986.DAT:*
                 8:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1987.DAT:*
                 9:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1988.DAT:*
                 10: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1989.DAT:*
                 11: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1990.DAT:*
                 12: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1991.DAT:*
                 13: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1992.DAT:*
                 14: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1993.DAT:*
                 15: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1994.DAT:*
                 16: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1995.DAT:*
                 17: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1996.DAT:*
                 18: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1997.DAT:*
                 19: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1998.DAT:*
                 20: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1999.DAT:*
                 21: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2000.DAT:*
                 22: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2001.DAT:*
                 23: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2002.DAT:*
                 24: /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2003.DAT:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

SUMMARIES

                  %
Result           Query
  No.   Score    Match  Length  DB  ID         Description

    1     67     100.0       9  18  AAW13416   Brain homing pepti
    2     67     100.0       9  21  AAB07388   Brain homing pepti
    3     67     100.0       9  22  AAE11794   Phage peptide #2 t
    4     67     100.0       9  23  AAU10705   Brain homing pepti
    5     67     100.0       9  24  ABU59530   Brain receptor tar
    6     47      70.1      83  22  ABG25219   Novel human diagno
    7     47      70.1     252  22  ABB59632   Drosophila melanog
    8     44      65.7      61  22  AAU60753   Propionibacterium
    9     43      64.2    1050  24  AAE32727   kiaauujz protein.
   10     43      64.2    1050  24  AAE32731   rinjKL-3 piocein \va.L
   11     43      64.2    1054  24  AAE32732   prouem ivar
   12     42      62.7      79  22  AAU59363   Propionibacterium
   13     42      62.7     755  22  AAB94435   Human protein sequ
   14     42      62.7     755  23  ABP69413   Human polypeptide
   15     42      62.7    1088  23  ABJ05495   Human breast cance
   16     42      62.7    1088  23  ABJ01044   Human breast speci
   17     41      61.2      40  22  AAM85611   Human immune/haema
   18     41      61.2      45  22  AAM06485   Human foetal prote
   19     41      61.2     206  20  AAY00145   Enterococcus faeca
   20     41      61.2     206  23  ABP43364   H EdcCallb iir / 1 d
   21     41      61.2     206  24  ABU13643   Enterococcus faeca
   22     41      61.2     207  20  AAY00144   Enterococcus faeca
   23     41      61.2     207  23  ABP43363   E faecalis EF071 p
   24     41      61.2     207  24  ABU13642   Enterococcus faeca
   25     40.5    60.4    1572  18  AAW27160   Mouse receptor ME2
   26     40.5    60.4    2707  18  AAW27161   Mouse receptor ME2
   27     40      59.7      20  22  ABB45238   Rabbit albumin-bin
   28     40      59.7     146  23  ABU51720   Helicobacter pylor
   29     40      59.7     187  22  ABG19166   Novel human diagno
   30     40      59.7     189  22  ABG04154   Novel human diagno
   31     40      59.7     206  22  ABG04155   Novel human diagno
   32     40      59.7     212  22  AAG89892   U yiuudniituiu £ji.LJL.e
   33     40      59.7     221  22  ABG04153   Novel human diagno
   34     40      59.7     243  23  ABG96331   Human ovarian canc
   35     40      59.7     243  23  ABG92083   Human receptors an
   36     40      59.7     244  22  AAO06091   Human polypeptide
   37     40      59.7     248  22  ABG19167   Novel human diagno
   38     40      59.7     250  22  AAE09454   Human sbg72825FOLA
   39     40      59.7     257  23  ABG96330   Human ovarian canc
   40     40      59.7     270  23  ABP41366   Human ovarian anti
   41     40      59.7     368  22  ABG19677   Novel human diagno
   42     40      59.7     418  23  ABU52257   Helicobacter pylor
   43     40      59.7     424  23  ABG30885   H. pylori alpha1,3
   44     40      59.7     454  23  ABG30887   H. pylori alpha1,3
   45     40      59.7     455  21  AAY54499   Mouse liver angiop



ALIGNMENTS 



RESULT 1 
AAW13416 

ID AAW13416 standard; Peptide; 9 AA. 
XX 

AC AAW13416; 
XX 

DT 15-JAN-1998 (first entry) 
XX 

DE Brain homing peptide. 
XX 

KW Brain homing peptide; in vivo panning; screening; phage display; 

KW drug delivery. 

XX 

OS Synthetic. 
XX 

PN WO9710507-A1.
XX 

PD 20-MAR-1997. 
XX 

PF 10-SEP-1996; 96WO-US14600.
XX

PR 11-SEP-1995; 95US-0526710.

PR 11-SEP-1995; 95US-0526708.
XX 

PA (LJOL-) LA JOLLA CANCER RES FOUND. 
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 1997-202359/18. 
XX 

PT Obtaining compound that homes to selected organ or tissue - by in 

PT vivo panning method, specifically to identify brain, kidney, 

PT angiogenic vasculature or tumour tissue homing peptide(s)
XX 

PS Claim 14; Page 67; 75pp; English. 
XX 

CC This synthetic peptide is a claimed example of a brain-homing 

CC peptide that was identified using a novel method for obtaining 

CC molecules that home to a selected organ or tissue. This in vivo 

CC panning method typically involves administering a phage display 

CC library to a subject, and identifying expressed peptides which 

CC home to the desired organ or tissue, e.g. brain, kidney, angiogenic 

CC vascular tissue or tumour tissue. The isolated peptides (see 

CC AAW13412-52, AAW11181-86) can be used to target e.g. drugs, toxins or 

CC labels to the selected organ/tissue (claimed) or to identify and/or 

CC isolate target molecules (claimed). The peptides can be directly

CC identified in vivo, as compared to prior art in vitro screening 

CC methods, which require further examination to see if they maintain 

CC specificity in vivo. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 67; DB 18; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CENWWGDVC 9

|||||||||



Db 1 CENWWGDVC 9 



RESULT 2 
AAB07388 

ID AAB07388 standard; peptide; 9 AA. 
XX 

AC AAB07388; 
XX 

DT 17-OCT-2000 (first entry) 
XX 

DE Brain homing peptide #2.
XX 

KW Brain; homing peptide; organ targeting; tissue targeting; mouse; cyclic. 
XX 

OS Mus sp. 
XX 

FH Key Location/Qualifiers 
FT Disulfide-bond 1..9

FT /note= "Can optionally form a cyclic peptide" 

XX 

PN US6068829-A. 
XX 

PD 30-MAY-2000. 
XX 

PF 23-JUN-1997; 97US-0862855.
XX

PR 11-SEP-1995; 95US-0526710.
PR 10-MAR-1997; 97US-0813273.
XX

PA (BURN-) BURNHAM INST.
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 2000-410850/35. 
XX 

PT Identifying and recovering organ homing molecules or peptides by in 
PT vivo panning comprises administering a library of diverse peptides 
PT linked to a tag which facilitates recovery of these peptides 
XX 

PS Example 2; Column 17; 20pp; English. 
XX 

CC The present sequence is a mouse brain homing peptide. This sequence was 
CC identified by using in vivo panning to screen a library of potential 
CC organ homing molecules. The present sequence can be used to direct a 
CC moiety to a the brain tissue, by linking the moiety to the present 
CC sequence. Examples of potential moieties are drugs, toxins or a 
CC detectable label. 
XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 67; DB 21; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CENWWGDVC 9

|||||||||



Db 1 CENWWGDVC 9 

RESULT 3 
AAE11794 

ID AAE11794 standard; peptide; 9 AA. 
XX 

AC AAE11794; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Phage peptide #2 targetted to brain. 
XX 

KW Enriched library fraction; brain; kidney; tumour; panning; diagnostic; 

KW molecular medicine; drug delivery; peptidomimetic; pharmaceutical.
XX 

OS Bacteriophage. 
XX 

PN US6296832-B1. 
XX 

PD 02-OCT-2001. 
XX 

PF 08-JAN-1999; 99US-0226985.
XX

PR 23-JUN-1997; 97US-0862855.

PR 11-SEP-1995; 95US-0526710.

PR 10-MAR-1997; 97US-0813273.
XX

PA (BURN-) BURNHAM INST.
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2001-610691/70. 
XX 

PT Enriched library fraction comprising molecules recovered by in vivo 

PT panning that selectively home to a selected organ or tissue useful for 

PT treating disease or in diagnostic methods 
XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The invention relates to an enriched library fraction containing 

CC molecules that selectively home to a selected organ or tissue such as 

CC brain, kidney or tumour recovered by in vivo panning. The invention 

CC generally relates to the field of molecular medicine, drug delivery and 

CC to a method of invivo panning for identifying a molecule that homes to a 

CC specific organ. The molecules, e.g., peptides, peptidomimetics, proteins

CC and fragments of proteins contained in an enriched library fraction may 

CC be administered to a subject as part of a pharmaceutical composition to 

CC treat disease or in diagnostic methods. The present sequence is a 

CC peptide from bacteriophage targetted to brain. 

XX 

SQ Sequence 9 AA; 



Query Match 100.0%; Score 67; DB 22; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CENWWGDVC 9

|||||||||

Db 1 CENWWGDVC 9 

RESULT 4 
AAU10705 

ID AAU10705 standard; peptide; 9 AA. 
XX 

AC AAU10705; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Brain homing peptide #2 useful for delivery of target molecules. 
XX 

KW Organ targeting; tissue targeting; cancer; tumour homing molecule; 

KW delivery of target molecule; brain homing peptide. 

XX 

OS Synthetic. 
XX 

PN US6306365-B1. 
XX 

PD 23-OCT-2001. 
XX 

PF 08-JAN-1999; 99US-0227906.
XX

PR 23-JUN-1997; 97US-0862855.

PR 11-SEP-1995; 95US-0526710.

PR 10-MAR-1997; 97US-0813273.
XX 

PA (BURN-) BURNHAM INST. 
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2002-040196/05. 
XX 

PT Recovering molecules that home to an organ or tissue, useful for 

PT identifying molecules that home to a specific organ or tissue, e.g. 

PT identifying a tumour homing molecule to identify the presence of cancer, 

PT by in vivo panning of a library - 

XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The present invention relates to a method of recovering molecules that 

CC home to a selected organ or tissue. The method comprises administering 

CC to the subject the library of diverse molecules, collecting a sample of 

CC the selected organ or tissue (e.g. brain or kidney), and recovering from 

CC the sample several molecules that home to the selected organ or tissue. 

CC The method is useful for identifying molecules, particularly useful for 

CC screening large number of molecules (e.g. peptides), that home to a

CC specific organ. The identified molecule is useful for e.g. raising an 

CC antibody specific for a target molecule, targeting a desired moiety 

CC (e.g. drug, toxin or detectable label) to the selected organ. 

CC Specifically, the method is useful for identifying the presence of cancer 

CC in a subject by linking an appropriate moiety to a tumour homing 

CC molecule. The present method provides a direct means for identifying 



CC molecules that specifically home to a selected organ and, therefore 

CC provides a significant advantage over previous methods, which require 

CC that a molecule identified using an in vitro screening method 

CC subsequently be examined to determine if it maintains its specificity in

CC vivo. AAU10704-AAU10723 represent brain homing peptides described in

CC the present invention. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 67; DB 23; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 CENWWGDVC 9

|||||||||
Db 1 CENWWGDVC 9 



RESULT 5
ABU59530

ID ABU59530 standard; Peptide; 9 AA.
XX

AC ABU59530;
XX

DT 22-APR-2003 (first entry)
XX

DE Brain receptor targeting peptide #2.
XX

KW Targeting ligand; bioactive agent; polymer matrix; cancer; cytostatic;

KW cathepsin-D substrate; peptides; neuroreceptor; adrenal receptor;

KW fibronectin; vitronectin; integrin; RGD motif; angiogenic endothelium;

KW tumour; cationic cancer-targeting peptide.
XX

OS Synthetic.
XX

PN US2002041898-A1.
XX

PD 11-APR-2002.
XX

PF 25-JUL-2001; 2001US-0912609.
XX

PR 05-JAN-2000; 2000US-0478124.

PR 31-OCT-2000; 2000US-0703474.
XX

PA (UNGE/) UNGER E C.

PA (MATS/) MATSUNAGA T O.

PA (RAMA/) RAMASWAMI V.

PA (ROMA/) ROMANOWSKI M J.
XX

PI Unger EC, Matsunaga TO, Ramaswami V, Romanowski MJ;
XX

DR WPI; 2003-208921/20.
XX

PT Targeted delivery system comprising a bioactive agent homogeneously

PT dispersed in a targeted matrix is especially useful in cancer therapy

PT
XX





PS Claim 23; Page 37; 46pp; English. 
XX 

CC The invention relates to a composition comprising a bioactive agent 

CC homogeneously dispersed in a targeted matrix (polymer and targeting 

CC ligand). Also included are a targeted matrix for use as a delivery

CC vehicle comprising a polymer associated with a targeting ligand, 

CC enhancing the bioavailability of an agent comprising administration 

CC of the composition and treating cancer comprising administration of the 

CC novel composition. The method is useful for targeted delivery of a drug, 

CC especially in cancer therapy. The targeting ligand may be a peptide. 

CC Examples of targeting peptides are disclosed including cathepsin-D 

CC substrate peptides, peptides targeting receptors in the brain and 

CC kidney, peptides recognising fibronectin- and vitronectin-binding 

CC integrins, peptides targeting the RGD (Arg-Gly-Asp) -motif in, e.g., 

CC antibodies, peptides targeting the angiogenic endothelium of solid 

CC tumours, tissue specific peptides (e.g. of lung, skin, pancreas, 

CC intestine, uterus, adrenal gland and retina), and cationic cancer- 

CC targeting peptides. The present sequence is a peptide targeting 

CC ligand disclosed in the invention. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 67; DB 24; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 CENWWGDVC 9 
     ||||||||| 
Db 1 CENWWGDVC 9 



RESULT 6 
ABG25219 

ID ABG25219 standard; Protein; 83 AA. 
XX 

AC ABG25219; 
XX 

DT 18-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #25210. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001. 
XX 

PF 30-MAR-2001; 2001WO-US08631. 
XX 

PR 31-MAR-2000; 2000US-0540217. 

PR 23-AUG-2000; 2000US-0649167. 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS89406. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity - 
XX 

PS Claim 20; SEQ ID No 55578; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and 

CC polypeptide (II) sequences. (I) is useful as hybridisation probes, 

CC polymerase chain reaction (PCR) primers, oligomers, and for chromosome 

CC and gene mapping, and in recombinant production of (II). The 

CC polynucleotides are also used in diagnostics as expressed sequence tags 

CC for identifying expressed genes. (I) is useful in gene therapy techniques 

CC to restore normal activity of (II) or to treat disease states involving 

CC (II). (II) is useful for generating antibodies against it, detecting or 

CC quantitating a polypeptide in tissue, as molecular weight markers and as 

CC a food supplement. (II) and its binding partners are useful in medical 

CC imaging of sites expressing (II). (I) and (II) are useful for treating 

CC disorders involving aberrant protein expression or biological activity. 

CC The polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human 

CC diagnostic amino acid sequences of the invention. 

CC Note: The sequence data for this patent did not appear in the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences. 
XX 

SQ Sequence 83 AA; 

Query Match 70.1%; Score 47; DB 22; Length 83; 

Best Local Similarity 85.7%; Pred. No. 7.9; 

Matches 6; Conservative 0; Mismatches 1; Indels 0; Gaps 0 

Qy  3 NWWGDVC 9 
      |||| || 
Db 33 NWWGSVC 39 



RESULT 7 
ABB59632 

ID ABB59632 standard; Protein; 252 AA. 
XX 

AC ABB59632; 
XX 

DT 26-MAR-2002 (first entry) 
XX 

DE Drosophila melanogaster polypeptide SEQ ID NO 5688. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 
KW pharmaceutical. 



XX 

OS Drosophila melanogaster . 
XX 

PN WO200171042-A2. 
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US09231. 
XX 

PR 23-MAR-2000; 2000US-191637P. 

PR 11-JUL-2000; 2000US-0614150. 
XX 

PA (PEKE-) PE CORP NY. 
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL03735. 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more 

PT genes from Drosophila and for elucidating cell signalling and cell-cell 

PT interactions - 
XX 

PS Disclosure; SEQ ID NO 5688; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention 

CC is useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of 

CC insecticides, therapeutics and pharmaceutical drugs. The invention 

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA 

CC sequences (ABL01840-ABL16175) and the encoded proteins 

CC (ABB57737-ABB72072). 

CC The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences. 

XX 

SQ Sequence 252 AA; 

Query Match 70.1%; Score 47; DB 22; Length 252; 

Best Local Similarity 75.0%; Pred. No. 23; 

Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 ENWWGDVC 9 
       |||| :|| 
Db 138 ENWWANVC 145 



RESULT 8 
AAU60753 

ID AAU60753 standard; Protein; 61 AA. 
XX 

AC AAU60753; 
XX 

DT 27-FEB-2002 (first entry) 
XX 

DE Propionibacterium acnes immunogenic protein #21649. 



XX 

KW SAPHO syndrome; synovitis; acne; pustulosis; hypertosis; osteomyelitis; 

KW uveitis; endophthalmitis; bone; joint; central nervous system; ELISA; 

KW inflammatory lesion; acne vulgaris; enzyme linked immunosorbent assay; 

KW dermatological; osteopathic; neuroprotectant. 
XX 

OS Propionibacterium acnes. 
XX 

PN WO200181581-A2 . 
XX 

PD 01-NOV-2001. 
XX 

PF 20-APR-2001; 2001WO-US12865. 
XX 

PR 21-APR-2000; 2000US-199047P . 

PR 02-JUN-2000; 2000US-208841P . 

PR 07-JUL-2000; 2000US-216747P . 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Skeiky YAW, Persing DH, Mitcham JL, Wang SS, Bhatia A; 

PI L'maisonneuve J, Zhang Y, Jen S, Carter D; 

XX 

DR WPI; 2001-616774/71. 

DR N-PSDB; AAS59612. 
XX 

PT Propionibacterium acnes polypeptides and nucleic acids useful for 

PT vaccinating against and diagnosing infections, especially useful for 

PT treating acne vulgaris - 
XX 

PS Example 1; SEQ ID No 21948; 1069pp; English. 
XX 

CC Sequences AAU39105-AAU68017 represent Propionibacterium acnes immunogenic 

CC polypeptides. The proteins and their associated DNA sequences are used in 

CC the treatment, prevention and diagnosis of medical conditions caused by 

CC P. acnes. The disorders include SAPHO syndrome (synovitis, acne, 

CC pustulosis, hypertosis and osteomyelitis), uveitis and endophthalmitis. 

CC P. acnes is also involved in infections of bone, joints and the central 

CC nervous system, however it is particularly involved in the inflammatory 

CC lesions associated with acne vulgaris. A method for detecting the 

CC presence or absence of P. acnes in a patient comprises contacting a 

CC sample with a binding agent that binds to the proteins of the invention 

CC and determining the amount of bound protein in the sample. The 

CC polypeptides may be used as antigens in the production of antibodies 

CC specific for P. acnes proteins. These antibodies can be used to 

CC downregulate expression and activity of P. acnes polypeptides and 

CC therefore treat P. acnes infections. The antibodies may also be used as 

CC diagnostic agents for determining P. acnes presence, for example, by 

CC enzyme linked immunosorbent assay (ELISA). 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences. 
XX 

SQ Sequence 61 AA; 



Query Match 65.7%; Score 44; DB 22; Length 61; 

Best Local Similarity 66.7%; Pred. No. 16; 



Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy  1 CENWWGDVC 9 
      | |:| ||| 
Db 49 CANYWQDVC 57 

RESULT 9 
AAE32727 

ID AAE32727 standard; Protein; 1050 AA. 
XX 

AC AAE32727; 
XX 

DT 24-MAR-2003 (first entry) 
XX 

DE KIAA0032 protein. 

KW Viral infection; lymphosarcoma; human immunodeficiency virus; hepatitis; 

KW poliomyelitis; HIV; measles; protein therapy; KIAA0032 protein. 

XX 

OS Unidentified. 
XX 

FH Key Location/Qualifiers 

FT Domain 52.. 102 

FT /note= "RCC1 domain" 

FT Domain 1012.. 1047 

FT /note= "HECT domain" 

XX 

PN WO200290549-A2 . 
XX 

PD 14-NOV-2002 . 
XX 

PF 12-MAR-2002; 2002WO-IB02106 . 
XX 

PR 12-MAR-2001; 2001US-275224P . 

PR 31-JUL-2001; 2001US-308958P. 

PR 07-DEC-2001; 2001US-340170P. 
XX 

PA (PROT-) PROTEOLOGICS LTD. 
XX 

PI Greener T, Moskowitz H, Reiss Y, Alroy I; 
XX 

DR WPI; 2003-111976/10. 

DR N-PSDB; AAD50461. 
XX 

PT New protein complex comprising HECT-RCC1, viral maturation scaffolding 

PT protein (VMSP), and/or HIV gag protein, useful for treating viral 

PT infections, such lymphosarcoma, HIV, hepatitis, poliomyelitis, measles, 

PT or Ebola 

XX 

PS Disclosure; Fig 24; 150pp; English. 
XX 

CC The invention relates to a method for modulation of viral maturation. 

CC The invention also provides an isolated protein complex comprising a 

CC HECT-RCC1 selected from HETT-WW, HECT-RCC1, Gag protein, Gag late 

CC domain, P13, actin, myosin, Hsp60, Hsp70, Hsp90, STAM1, STAM2A, STAM2B, 

CC VHS-UIM, GTPase, E2 enzyme, tsg101, cullin, HERC1, HERC2, HERC3, Nedd4 

CC -like protein or clathrin. The complexes, proteins, antibodies and 

CC methods are useful for treating viral infections, such as lymphosarcoma, 

CC human immunodeficiency virus (HIV), hepatitis, poliomyelitis, measles, 

CC or Ebola and for inhibiting budding in a subject. They are also useful 

CC in diagnostic assays for determining whether a cell is infected with a 

CC virus and for characterising the nature, progression and/or infectivity 

CC of the infection. The invention is also useful in protein therapy. The 

CC present sequence is KIAA0032 protein used to illustrate the method of 

CC the invention. 
XX 

SQ Sequence 1050 AA; 

Query Match 64.2%; Score 43; DB 24; Length 1050; 

Best Local Similarity 62.5%; Pred. No. 3.5e+02; 

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy   2 ENWWGDVC 9 
       :|||  || 
Db 545 DNWWSQVC 552 



RESULT 10 
AAE32731 

ID AAE32731 standard; Protein; 1050 AA. 
XX 

AC AAE32731; 
XX 

DT 24-MAR-2003 (first entry) 
XX 

DE HERC3 protein (var1). 
XX 

KW Viral infection; lymphosarcoma; human immunodeficiency virus; hepatitis; 

KW poliomyelitis; HIV; measles; protein therapy; HERC3 protein. 

XX 

OS Unidentified. 
XX 

PN WO200290549-A2 . 
XX 

PD 14-NOV-2002 . 
XX 

PF 12-MAR-2002; 2002WO-IB02106. 
XX 

PR 12-MAR-2001; 2001US-275224P. 

PR 31-JUL-2001; 2001US-308958P. 

PR 07-DEC-2001; 2001US-340170P. 
XX 

PA (PROT-) PROTEOLOGICS LTD. 
XX 

PI Greener T, Moskowitz H, Reiss Y, Alroy I; 
XX 

DR WPI; 2003-111976/10. 

DR N-PSDB; AAD50465. 
XX 

PT New protein complex comprising HECT-RCC1, viral maturation scaffolding 

PT protein (VMSP), and/or HIV gag protein, useful for treating viral 

PT infections, such lymphosarcoma, HIV, hepatitis, poliomyelitis, measles, 

PT or Ebola 



XX 

PS Claim 36; Fig 28; 150pp; English. 
XX 

CC The invention relates to a method for modulation of viral maturation. 

CC The invention also provides an isolated protein complex comprising a 

CC HECT-RCC1 selected from HETT-WW, HECT-RCC1, Gag protein, Gag late 

CC domain, P13, actin, myosin, Hsp60, Hsp70, Hsp90, STAM1, STAM2A, STAM2B, 

CC VHS-UIM, GTPase, E2 enzyme, tsg101, cullin, HERC1, HERC2, HERC3, Nedd4 

CC -like protein or clathrin. The complexes, proteins, antibodies and 

CC methods are useful for treating viral infections, such as lymphosarcoma, 

CC human immunodeficiency virus (HIV), hepatitis, poliomyelitis, measles, 

CC or Ebola and for inhibiting budding in a subject. They are also useful 

CC in diagnostic assays for determining whether a cell is infected with a 

CC virus and for characterising the nature, progression and/or infectivity 

CC of the infection. The invention is also useful in protein therapy. The 

CC present sequence is HERC3 protein used to illustrate the method of the 

CC invention. 

XX 

SQ Sequence 1050 AA; 

Query Match 64.2%; Score 43; DB 24; Length 1050; 

Best Local Similarity 62.5%; Pred. No. 3.5e+02; 

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy   2 ENWWGDVC 9 
       :|||  || 
Db 545 DNWWSQVC 552 


RESULT 11 
AAE32732 

ID AAE32732 standard; Protein; 1054 AA. 
XX 

AC AAE32732; 
XX 

DT 24-MAR-2003 (first entry) 
XX 

DE HERC3 protein (var2). 
XX 

KW Viral infection; lymphosarcoma; human immunodeficiency virus; hepatitis; 

KW poliomyelitis; HIV; measles; protein therapy; HERC3 protein. 
XX 

OS Unidentified. 
XX 

PN WO200290549-A2. 
XX 

PD 14-NOV-2002. 
XX 

PF 12-MAR-2002; 2002WO-IB02106. 
XX 

PR 12-MAR-2001; 2001US-275224P. 

PR 31-JUL-2001; 2001US-308958P. 

PR 07-DEC-2001; 2001US-340170P. 
XX 

PA (PROT-) PROTEOLOGICS LTD. 
XX 

PI Greener T, Moskowitz H, Reiss Y, Alroy I; 



XX 

DR WPI; 2003-111976/10. 

DR N-PSDB; AAD50466. 
XX 

PT New protein complex comprising HECT-RCC1, viral maturation scaffolding 

PT protein (VMSP), and/or HIV gag protein, useful for treating viral 

PT infections, such lymphosarcoma, HIV, hepatitis, poliomyelitis, measles, 

PT or Ebola 

XX 

PS Claim 36; Fig 29; 150pp; English. 
XX 

CC The invention relates to a method for modulation of viral maturation. 

CC The invention also provides an isolated protein complex comprising a 

CC HECT-RCC1 selected from HETT-WW, HECT-RCC1, Gag protein, Gag late 

CC domain, P13, actin, myosin, Hsp60, Hsp70, Hsp90, STAM1, STAM2A, STAM2B, 

CC VHS-UIM, GTPase, E2 enzyme, tsg101, cullin, HERC1, HERC2, HERC3, Nedd4 

CC -like protein or clathrin. The complexes, proteins, antibodies and 

CC methods are useful for treating viral infections, such as lymphosarcoma, 

CC human immunodeficiency virus (HIV), hepatitis, poliomyelitis, measles, 

CC or Ebola and for inhibiting budding in a subject. They are also useful 

CC in diagnostic assays for determining whether a cell is infected with a 

CC virus and for characterising the nature, progression and/or infectivity 

CC of the infection. The invention is also useful in protein therapy. The 

CC present sequence is HERC3 protein used to illustrate the method of the 

CC invention. 

XX 

SQ Sequence 1054 AA; 

Query Match 64.2%; Score 43; DB 24; Length 1054; 

Best Local Similarity 62.5%; Pred. No. 3.5e+02; 

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy   2 ENWWGDVC 9 
       :|||  || 
Db 549 DNWWSQVC 556 


RESULT 12 
AAU59363 

ID AAU59363 standard; Protein; 79 AA. 
XX 

AC AAU59363; 
XX 

DT 27-FEB-2002 (first entry) 
XX 

DE Propionibacterium acnes immunogenic protein #20259. 
XX 

KW SAPHO syndrome; synovitis; acne; pustulosis; hypertosis; osteomyelitis; 

KW uveitis; endophthalmitis; bone; joint; central nervous system; ELISA; 

KW inflammatory lesion; acne vulgaris; enzyme linked immunosorbent assay; 

KW dermatological; osteopathic; neuroprotectant. 
XX 

OS Propionibacterium acnes. 
XX 

PN WO200181581-A2. 
XX 

PD 01-NOV-2001. 
XX 

PF 20-APR-2001; 2001WO-US12865 . 
XX 

PR 21-APR-2000; 2000US-199047P . 

PR 02-JUN-2000; 2000US-208841P . 

PR 07-JUL-2000; 2000US-216747P . 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Skeiky YAW, Persing DH, Mitcham JL, Wang SS, Bhatia A; 

PI L'maisonneuve J, Zhang Y, Jen S, Carter D; 

XX 

DR WPI; 2001-616774/71. 

DR N-PSDB; AAS59602. 
XX 

PT Propionibacterium acnes polypeptides and nucleic acids useful for 

PT vaccinating against and diagnosing infections, especially useful for 

PT treating acne vulgaris - 
XX 

PS Example 1; SEQ ID No 20558; 1069pp; English. 
XX 

CC Sequences AAU39105-AAU68017 represent Propionibacterium acnes immunogenic 

CC polypeptides. The proteins and their associated DNA sequences are used in 

CC the treatment, prevention and diagnosis of medical conditions caused by 

CC P. acnes. The disorders include SAPHO syndrome (synovitis, acne, 

CC pustulosis, hypertosis and osteomyelitis), uveitis and endophthalmitis. 

CC P. acnes is also involved in infections of bone, joints and the central 

CC nervous system, however it is particularly involved in the inflammatory 

CC lesions associated with acne vulgaris. A method for detecting the 

CC presence or absence of P. acnes in a patient comprises contacting a 

CC sample with a binding agent that binds to the proteins of the invention 

CC and determining the amount of bound protein in the sample. The 

CC polypeptides may be used as antigens in the production of antibodies 

CC specific for P. acnes proteins. These antibodies can be used to 

CC downregulate expression and activity of P. acnes polypeptides and 

CC therefore treat P. acnes infections. The antibodies may also be used as 

CC diagnostic agents for determining P. acnes presence, for example, by 

CC enzyme linked immunosorbent assay (ELISA). 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences. 
XX 

SQ Sequence 79 AA; 

Query Match 62.7%; Score 42; DB 22; Length 79; 

Best Local Similarity 66.7%; Pred. No. 41; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0 

Qy  1 CENWWGDVC 9 
      ||: | ||| 
Db 43 CESAWSDVC 51 



RESULT 13 
AAB94435 

ID AAB94435 standard; Protein; 755 AA. 
XX 



AC AAB94435; 
XX 

DT 26-JUN-2001 (first entry) 
XX 

DE Human protein sequence SEQ ID NO: 15056. 
XX 

KW Human; primer; detection; diagnosis; antisense therapy; gene therapy. 
XX 

OS Homo sapiens. 
XX 

PN EP1074617-A2 . 
XX 

PD 07-FEB-2001. 
XX 

PF 28-JUL-2000; 2000EP-0116126 . 
XX 

PR 29-JUL-1999; 99JP-0248036 . 

PR 27-AUG-1999; 99JP-0300253 . 

PR 11-JAN-2000; 2000JP-0118776. 

PR 02-MAY-2000; 2000JP-0183767 . 

PR 09-JUN-2000; 2000JP-0241899 . 
XX 

PA (HELI-) HELIX RES INST. 
XX 

PI Ota T, Isogai T, Nishikawa T, Hayashi K, Saito K, Yamamoto J; 

PI Ishii S, Sugiyama T, Wakamatsu A, Nagai K, Otsuki T; 

XX 

DR WPI; 2001-318749/34. 
XX 

PT Primer sets for synthesizing polynucleotides, particularly the 5602 

PT full-length cDNAs defined in the specification, and for the detection 

PT and/or diagnosis of the abnormality of the proteins encoded by the 

PT full-length cDNAs - 
XX 

PS Claim 8; SEQ ID 15056; 2537pp + CD ROM; English. 
XX 

CC The present invention describes primer sets for synthesising 5602 

CC full-length cDNAs defined in the specification. Where a primer set 

CC comprises: (a) an oligo-dT primer and an oligonucleotide complementary 

CC to the complementary strand of a polynucleotide which comprises one of 

CC the 5602 nucleotide sequences defined in the specification, where the 

CC oligonucleotide comprises at least 15 nucleotides; or (b) a combination 

CC of an oligonucleotide comprising a sequence complementary to the 

CC complementary strand of a polynucleotide which comprises a 5'-end 

CC sequence and an oligonucleotide comprising a sequence complementary to a 

CC polynucleotide which comprises a 3'-end sequence, where the 

CC oligonucleotide comprises at least 15 nucleotides and the combination of 

CC the 5'-end sequence/3'-end sequence is selected from those defined in 

CC the specification. The primer sets can be used in antisense therapy and 

CC in gene therapy. The primers are useful for synthesising polynucleotides, 

CC particularly full-length cDNAs. The primers are also useful for the 

CC detection and/or diagnosis of the abnormality of the proteins encoded by 

CC the full-length cDNAs. The primers allow obtaining of the full-length 

CC cDNAs easily without any specialised methods. AAH03166 to AAH13628 and 

CC AAH13633 to AAH18742 represent human cDNA sequences; AAB92446 to 

CC AAB95893 represent human amino acid sequences; and AAH13629 to AAH13632 

CC represent oligonucleotides, all of which are used in the exemplification 



CC of the present invention. 
XX 

SQ Sequence 755 AA; 

Query Match 62.7%; Score 42; DB 22; Length 755; 

Best Local Similarity 100.0%; Pred. No. 3.6e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 CENWW 5 
       ||||| 
Db 327 CENWW 331 



RESULT 14 
ABP69413 

ID ABP69413 standard; Protein; 755 AA. 
XX 

AC ABP69413; 
XX 

DT 20-JAN-2003 (first entry) 
XX 

DE Human polypeptide SEQ ID NO 1460. 
XX 

KW Human; genome mapping; gene therapy; food supplement; virus; fungus; 

KW cell-proliferative disorder; neurodegenerative disease; bacterial; 

KW Parkinson's disease; Alzheimer's disease; autoimmune disease; 

KW multiple sclerosis; diabetes; genetic disorder; wound; burn; infection; 

KW arthritis; cytostatic; immunomodulator; nootropic; neuroprotective; 

KW antiparkinsonian; antidiabetic; immunosuppressive; dermatological; 

KW haemostatic; vulnerary; fungicide; antibacterial; virucide; protozoacide; 

KW antiarthritic. 

XX 

OS Homo sapiens. 
XX 

PN WO200270539-A2 . 
XX 

PD 12-SEP-2002. 
XX 

PF 05-MAR-2002; 2002WO-US05095. 
XX 

PR 05-MAR-2001; 2001US-0799451. 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Zhou P, Goodrich RW, Asundi V, Zhang J, Zhao QA, Ren F; 

PI Xue AJ, Yang Y, Ma Y, Yamazaki V, Chen R, Wang Z, Ghosh M; 

PI Wehrman T, Wang J, Wang D, Drmanac RT; 
XX 

DR WPI; 2002-759812/82. 

DR N-PSDB; ABZ11630. 
XX 

PT New polynucleotides comprising sequences assembled from expressed 

PT sequence tags (ESTs), useful for treating cell-proliferative, 

PT neurodegenerative, autoimmune, genetic, myeloid or lymphoid, or 

PT platelet or coagulation disorders 
XX 

PS Claim 9; SEQ ID NO 1460; 1012pp + Sequence Listing; English. 



XX 

CC The invention relates to an isolated polynucleotide (I) comprising a 

CC nucleotide sequence selected from any of 948 sequences 

CC (ABZ11119-ABZ12066) or their mature protein coding portion, active domain 

CC coding protein or complementary sequences. The polynucleotides are useful 

CC for identifying expressed genes or for physical mapping of human genome. 

CC The encoded polypeptides (ABP68902-ABP69849) are useful as molecular 

CC weight markers, as a food supplement, for generating antibodies, in 

CC medical imaging, screening and diagnostic assays and for treating 

CC cell-proliferative disorders (cancer), neurodegenerative diseases 

CC (Parkinson's or Alzheimer's disease), autoimmune diseases (multiple 

CC sclerosis, diabetes, lupus) genetic disorders, myeloid or lymphoid 

CC disorders, platelet or coagulation disorders, wound, burns, incision, 

CC ulcers, liver or lung fibrosis, infections (bacterial, viral, fungal, 

CC parasitic), arthritis, etc. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences. 
XX 

SQ Sequence 755 AA; 

Query Match 62.7%; Score 42; DB 23; Length 755; 

Best Local Similarity 100.0%; Pred. No. 3.6e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy   1 CENWW 5 
       ||||| 
Db 327 CENWW 331 




RESULT 15 
ABJ05495 

ID ABJ05495 standard; Protein; 1088 AA. 
XX 

AC ABJ05495; 
XX 

DT 14-NOV-2002 (first entry) 
XX 

DE Human breast cancer associated polypeptide SEQ ID NO: 255. 
XX 

KW Human; breast specific gene; breast specific protein; breast cancer; 

KW gene therapy; cytostatic. 
XX 

OS Homo sapiens. 
XX 

PN WO200264611-A1. 
XX 

PD 22-AUG-2002. 
XX 

PF 12-FEB-2002; 2002WO-US04197. 
XX 

PR 13-FEB-2001; 2001US-268292P. 
XX 

PA (DIAD-) DIADEXUS INC. 
XX 

PI Salceda S, Macina RA, Hu P, Recipon H, Karra K, Cafferkey R; 

PI Sun Y, Liu C; 





XX 

DR WPI; 2002-657582/70. 
XX 

PT New breast specific nucleic acids and proteins, useful for identifying, 

PT diagnosing, monitoring, staging, imaging, and treating breast cancer 

PT and non-cancerous disease states in breast tissue, and in gene therapy 
PT 
XX 

PS Claim 11; Page 330-334; 367pp; English. 
XX 

CC The present invention provides human breast specific coding sequences and 

CC proteins. These can be used in the diagnosis and treatment of breast 

CC cancer and non-cancerous diseases of the breast. The present sequence is 

CC a polypeptide of the invention. 

XX 

SQ Sequence 1088 AA; 

Query Match 62.7%; Score 42; DB 23; Length 1088; 

Best Local Similarity 100.0%; Pred. No. 5.1e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy   1 CENWW 5 
       ||||| 
Db 660 CENWW 664 



Search completed: November 13, 2003, 09:45:22 
Job time : 31.2812 secs 

GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: November 13, 2003, 09:45:35 ; Search time 18.6562 Seconds 
(without alignments) 
88.069 Million cell updates/sec 

Title: US-09-228-866-2 
Perfect score: 67 
Sequence: 1 CENWWGDVC 9 

Scoring table: BLOSUM62 
Gapop 10.0 , Gapext 0.5 

Searched: 666188 seqs, 182559486 residues 

Total number of hits satisfying chosen parameters: 666188 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 



Database : Published_Applications_AA:* 

1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:* 
2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:* 
3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:* 
4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:* 
5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:* 
6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:* 
7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:* 
8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:* 
9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:* 
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:* 
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:* 
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:* 
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:* 
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:* 
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:* 
16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:* 
17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:* 
18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 

                   %
Result         Query
   No.  Score  Match Length  DB  ID                  Description
     1     43   64.2   1050  15  US-10-097-534-24    Sequence 24, Appl
     2     43   64.2   1050  15  US-10-097-534-28    Sequence 28, Appl
     3     43   64.2   1054  15  US-10-097-534-29    Sequence 29, Appl
     4     42   62.7   1088  14  US-10-001-887-127   Sequence 127, App
     5     42   62.7   1088  15  US-10-074-475-255   Sequence 255, App
     6     40   59.7    212  10  US-09-738-626-3646  Sequence 3646, Ap
     7     40   59.7    243  15  US-10-097-340-117   Sequence 117, App
     8     40   59.7    250  12  US-10-203-708-41    Sequence 41, Appl
     9     40   59.7    257  15  US-10-097-340-111   Sequence 111, App
    10     40   59.7    372  12  US-10-189-977-13    Sequence 13, Appl
    11     40   59.7    372  14  US-10-120-319-13    Sequence 13, Appl
    12     40   59.7    424   9  US-09-733-524-16    Sequence 16, Appl
    13     40   59.7    425  12  US-10-189-977-6     Sequence 6, Appli
    14     40   59.7    425  12  US-10-392-098-6     Sequence 6, Appli
    15     40   59.7    425  14  US-10-120-319-6     Sequence 6, Appli
    16     40   59.7    454   9  US-09-733-524-18    Sequence 18, Appl
    17     40   59.7    454  12  US-10-189-977-8     Sequence 8, Appli
    18     40   59.7    454  12  US-10-392-098-8     Sequence 8, Appli
    19     40   59.7    454  14  US-10-120-319-8     Sequence 8, Appli
    20     40   59.7    455  12  US-10-460-125-2     Sequence 2, Appli
    21     40   59.7    464  12  US-10-189-977-1     Sequence 1, Appli
    22     40   59.7    464  12  US-10-392-098-1     Sequence 1, Appli
    23     40   59.7    464  14  US-10-120-319-1     Sequence 1, Appli
    24     40   59.7    476   9  US-09-733-524-15    Sequence 15, Appl
    25     40   59.7    476  12  US-10-189-977-5     Sequence 5, Appli
    26     40   59.7    476  12  US-10-392-098-5     Sequence 5, Appli
    27     40   59.7    476  14  US-10-120-319-5     Sequence 5, Appli
    28     40   59.7    478  12  US-10-189-977-7     Sequence 7, Appli
    29     40   59.7    478  12  US-10-392-098-7     Sequence 7, Appli
    30     40   59.7    478  14  US-10-120-319-7     Sequence 7, Appli
    31     40   59.7    479   9  US-09-733-524-17    Sequence 17, Appl
    32     40   59.7    485   9  US-09-733-524-2     Sequence 2, Appli
    33     40   59.7    486  12  US-10-189-977-2     Sequence 2, Appli
    34     40   59.7    486  12  US-10-392-098-2     Sequence 2, Appli
    35     40   59.7    486  14  US-10-120-319-2     Sequence 2, Appli
    36     40   59.7    501   9  US-09-733-524-1     Sequence 1, Appli
    37     39   58.2     12  12  US-10-190-082-556   Sequence 556, App
    38     39   58.2    237  10  US-09-738-626-6721  Sequence 6721, Ap
    39     39   58.2    285  10  US-09-738-626-4153  Sequence 4153, Ap
    40     39   58.2    500  10  US-09-323-998D-59   Sequence 59, Appl
    41     39   58.2    790  15  US-10-156-761-9515  Sequence 9515, Ap
    42     39   58.2    832  12  US-10-369-294-21    Sequence 21, Appl
    43     39   58.2    853  12  US-10-369-294-13    Sequence 13, Appl
    44     38   56.7     86  10  US-09-751-100B-52   Sequence 52, Appl
    45     38   56.7     86  10  US-09-751-100B-53   Sequence 53, Appl



ALIGNMENTS 



RESULT 1 

US-10-097-534-24 

; Sequence 24, Application US/10097534 

; Publication No. US20030049607A1 

; GENERAL INFORMATION: 

; APPLICANT: GREENER, TSVIKA 

; APPLICANT: MOSKOWITZ, HAIM 

; APPLICANT: REISS, YUVAL 

; APPLICANT: ALROY, IRIS 

; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE MODULATION OF VIRAL 
; TITLE OF INVENTION: MATURATION 
; FILE REFERENCE: PLV-001.01 

; CURRENT APPLICATION NUMBER: US/10/097,534 

; CURRENT FILING DATE: 2002-03-12 

; PRIOR APPLICATION NUMBER: 60/275,224 

; PRIOR FILING DATE: 2001-03-12 

; PRIOR APPLICATION NUMBER: 60/308,958 

; PRIOR FILING DATE: 2001-07-31 

; PRIOR APPLICATION NUMBER: 60/340,170 

; PRIOR FILING DATE: 2001-12-07 
; NUMBER OF SEQ ID NOS: 71 

; SOFTWARE: PatentIn Ver. 2.1 
; SEQ ID NO 24 
; LENGTH: 1050 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-10-097-534-24 

Query Match 64.2%; Score 43; DB 15; Length 1050; 

Best Local Similarity 62.5%; Pred. No. 3.7e+02; 

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy   2 ENWWGDVC 9 
       :|||  || 
Db 545 DNWWSQVC 552 



RESULT 2 

US-10-097-534-28 

; Sequence 28, Application US/10097534 

; Publication No. US20030049607A1 

; GENERAL INFORMATION: 

; APPLICANT: GREENER, TSVIKA 

; APPLICANT: MOSKOWITZ, HAIM 

; APPLICANT: REISS, YUVAL 

; APPLICANT: ALROY, IRIS 

; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE MODULATION OF VIRAL 
; TITLE OF INVENTION: MATURATION 
; FILE REFERENCE: PLV-001.01 

; CURRENT APPLICATION NUMBER: US/10/097,534

; CURRENT FILING DATE: 2002-03-12 

; PRIOR APPLICATION NUMBER: 60/275,224 

; PRIOR FILING DATE: 2001-03-12 

; PRIOR APPLICATION NUMBER: 60/308,958 

; PRIOR FILING DATE: 2001-07-31 

; PRIOR APPLICATION NUMBER: 60/340,170 

; PRIOR FILING DATE: 2001-12-07 

; NUMBER OF SEQ ID NOS: 71
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 28
; LENGTH: 1050
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-097-534-28 

Query Match 64.2%; Score 43; DB 15; Length 1050; 

Best Local Similarity 62.5%; Pred. No. 3.7e+02; 

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy   2 ENWWGDVC 9
       :|||  ||
Db 545 DNWWSQVC 552



RESULT 3 

US-10-097-534-29 

; Sequence 29, Application US/10097534 

; Publication No. US20030049607A1 

; GENERAL INFORMATION: 

; APPLICANT: GREENER, TSVIKA 

; APPLICANT: MOSKOWITZ, HAIM 

; APPLICANT: REISS, YUVAL

; APPLICANT: ALROY, IRIS 

; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE MODULATION OF VIRAL 
; TITLE OF INVENTION: MATURATION 
; FILE REFERENCE: PLV-001.01 

; CURRENT APPLICATION NUMBER: US/10/097,534

; CURRENT FILING DATE: 2002-03-12 

; PRIOR APPLICATION NUMBER: 60/275,224 

; PRIOR FILING DATE: 2001-03-12 

; PRIOR APPLICATION NUMBER: 60/308,958 

; PRIOR FILING DATE: 2001-07-31
; PRIOR APPLICATION NUMBER: 60/340,170
; PRIOR FILING DATE: 2001-12-07
; NUMBER OF SEQ ID NOS: 71
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 29
; LENGTH: 1054
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-097-534-29 



Query Match 64.2%; Score 43; DB 15; Length 1054;
Best Local Similarity 62.5%; Pred. No. 3.7e+02;
Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy   2 ENWWGDVC 9
       :|||  ||
Db 549 DNWWSQVC 556



RESULT 4 

US-10-001-887-127 

; Sequence 127, Application US/10001887
; Publication No. US20020155464A1
; GENERAL INFORMATION:
; APPLICANT: Salceda, Susana
; APPLICANT: Macina, Roberto
; APPLICANT: Recipon, Herve
; APPLICANT: Cafferkey, Robert
; APPLICANT: Sun, Yongming
; APPLICANT: Liu, Chenghua
; TITLE OF INVENTION: Compositions and Methods Relating to Breast Specific
; TITLE OF INVENTION: Genes and Proteins
; FILE REFERENCE: DEX-0269
; CURRENT APPLICATION NUMBER: US/10/001,887
; CURRENT FILING DATE: 2001-11-20
; PRIOR APPLICATION NUMBER: 60/249,998
; PRIOR FILING DATE: 2000-11-20
; PRIOR APPLICATION NUMBER: 60/252,563
; PRIOR FILING DATE: 2000-11-22
; NUMBER OF SEQ ID NOS: 137
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 127
; LENGTH: 1088
; TYPE: PRT
; ORGANISM: Homo sapien
US-10-001-887-127 

Query Match 62.7%; Score 42; DB 14; Length 1088; 

Best Local Similarity 100.0%; Pred. No. 5.1e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 CENWW 5
       |||||
Db 660 CENWW 664



RESULT 5 



US-10-074-475-255 

; Sequence 255, Application US/10074475
; Publication No. US20030092898A1
; GENERAL INFORMATION:
; APPLICANT: Salceda, Susana
; APPLICANT: Macina, Roberto
; APPLICANT: Hu, Ping
; APPLICANT: Recipon, Herve
; APPLICANT: Karra, Kalpana
; APPLICANT: Cafferkey, Robert
; APPLICANT: Sun, Yongming
; APPLICANT: Liu, Chenghua
; TITLE OF INVENTION: Compositions and Methods Relating to Breast Specific
; TITLE OF INVENTION: Genes and Proteins
; FILE REFERENCE: DEX-0313
; CURRENT APPLICATION NUMBER: US/10/074,475
; CURRENT FILING DATE: 2002-02-13
; PRIOR APPLICATION NUMBER: 60/268,292
; PRIOR FILING DATE: 2001-02-13
; NUMBER OF SEQ ID NOS: 295
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 255
; LENGTH: 1088
; TYPE: PRT
; ORGANISM: Homo sapien
US-10-074-475-255 

Query Match 62.7%; Score 42; DB 15; Length 1088; 

Best Local Similarity 100.0%; Pred. No. 5.1e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 CENWW 5
       |||||
Db 660 CENWW 664



RESULT 6 

US-09-738-626-3646 

; Sequence 3646, Application US/09738626
; Publication No. US20020197605A1
; GENERAL INFORMATION:
; APPLICANT: NAKAGAWA, SATOSHI
; APPLICANT: MIZOGUCHI, HIROSHI
; APPLICANT: ANDO, SEIKO
; APPLICANT: HAYASHI, MIKIRO
; APPLICANT: OCHIAI, KEIKO
; APPLICANT: YOKOI, HARUHIKO
; APPLICANT: TATEISHI, NAOKO
; APPLICANT: SENOH, AKIHIRO
; APPLICANT: IKEDA, MASATO
; APPLICANT: OZAKI, AKIO
; TITLE OF INVENTION: NOVEL POLYNUCLEOTIDES
; FILE REFERENCE: 249-125
; CURRENT APPLICATION NUMBER: US/09/738,626
; CURRENT FILING DATE: 2000-12-18
; PRIOR APPLICATION NUMBER: JP 99/377484
; PRIOR FILING DATE: 1999-12-16
; PRIOR APPLICATION NUMBER: JP 00/159162
; PRIOR FILING DATE: 2000-04-07
; PRIOR APPLICATION NUMBER: JP 00/280988
; PRIOR FILING DATE: 2000-08-03
; NUMBER OF SEQ ID NOS: 7059
; SOFTWARE: PatentIn ver. 3.0
; SEQ ID NO 3646
; LENGTH: 212
; TYPE: PRT
; ORGANISM: Corynebacterium glutamicum
US-09-738-626-3646 

Query Match 59.7%; Score 40; DB 10; Length 212;

Best Local Similarity 55.6%; Pred. No. 2.6e+02;

Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy  1 CENWWGDVC 9
      |:| || :|
Db 35 CDNTWGRLC 43



RESULT 7 

US-10-097-340-117 

; Sequence 117, Application US/10097340
; Publication No. US20030087250A1
; GENERAL INFORMATION:
; APPLICANT: John MONAHAN
; APPLICANT: Manjula GANNAVARAPU
; APPLICANT: Sebastian HOERSCH
; APPLICANT: Shubhangi KAMATKAR
; APPLICANT: Steve G. KOVATS
; APPLICANT: Rachel E. MEYERS
; APPLICANT: Michael MORRISEY
; APPLICANT: Peter OLANDT
; APPLICANT: Ami SEN
; APPLICANT: Peter VEIBY
; APPLICANT: Gordon B. MILLS
; APPLICANT: Robert C. BAST, Jr.
; APPLICANT: Karen LU
; APPLICANT: Rosemarie SCHMANDT
; APPLICANT: Xumei ZHAO
; APPLICANT: Karen GLATT
; TITLE OF INVENTION: Nucleic Acid Molecules and Proteins For The Identification,
; TITLE OF INVENTION: Assessment, Prevention, and Therapy of Ovarian Cancer
; FILE REFERENCE: MRI-030
; CURRENT APPLICATION NUMBER: US/10/097,340
; CURRENT FILING DATE: 2002-03-14
; PRIOR APPLICATION NUMBER: 60/276,025
; PRIOR FILING DATE: 2001-03-14
; PRIOR APPLICATION NUMBER: 60/325,149
; PRIOR FILING DATE: 2001-09-26
; PRIOR APPLICATION NUMBER: 60/276,026
; PRIOR FILING DATE: 2001-03-14
; PRIOR APPLICATION NUMBER: 60/324,967
; PRIOR FILING DATE: 2001/09/26
; PRIOR APPLICATION NUMBER: 60/311,732
; PRIOR FILING DATE: 2001-08-10
; PRIOR APPLICATION NUMBER: 60/325,102
; PRIOR FILING DATE: 2001-09-26
; PRIOR APPLICATION NUMBER: 60/323,580
; PRIOR FILING DATE: 2001-09-19
; NUMBER OF SEQ ID NOS: 363
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 117
; LENGTH: 243
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-097-340-117 



Query Match 59.7%; Score 40; DB 15; Length 243; 

Best Local Similarity 71.4%; Pred. No. 2.9e+02;

Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CENWWGD 7
       || || |
Db 137 CERWWED 143



RESULT 8 

US-10-203-708-41 

; Sequence 41, Application US/10203708
; Publication No. US20030149238A1
; GENERAL INFORMATION:
; APPLICANT: SMITHKLINE BEECHAM CORPORATION
; APPLICANT: SMITHKLINE BEECHAM p.l.c.
; TITLE OF INVENTION: NOVEL COMPOUNDS
; FILE REFERENCE: GP50013
; CURRENT APPLICATION NUMBER: US/10/203,708
; CURRENT FILING DATE: 2002-08-13
; PRIOR APPLICATION NUMBER: PCT/US01/04703
; PRIOR FILING DATE: 2001-02-14
; PRIOR APPLICATION NUMBER: 60/182,172
; PRIOR FILING DATE: 2000-02-14
; PRIOR APPLICATION NUMBER: 60/186,084
; PRIOR FILING DATE: 2000-02-29
; NUMBER OF SEQ ID NOS: 46
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 41
; LENGTH: 250
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-203-708-41 



Query Match 59.7%; Score 40; DB 12; Length 250; 

Best Local Similarity 71.4%; Pred. No. 3e+02; 

Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy   1 CENWWGD 7
       || || |
Db 136 CEEWWED 142



RESULT 9 



US-10-097-340-111 

; Sequence 111, Application US/10097340
; Publication No. US20030087250A1
; GENERAL INFORMATION:
; APPLICANT: John MONAHAN
; APPLICANT: Manjula GANNAVARAPU
; APPLICANT: Sebastian HOERSCH
; APPLICANT: Shubhangi KAMATKAR
; APPLICANT: Steve G. KOVATS
; APPLICANT: Rachel E. MEYERS
; APPLICANT: Michael MORRISEY
; APPLICANT: Peter OLANDT
; APPLICANT: Ami SEN
; APPLICANT: Peter VEIBY
; APPLICANT: Gordon B. MILLS
; APPLICANT: Robert C. BAST, Jr.
; APPLICANT: Karen LU
; APPLICANT: Rosemarie SCHMANDT
; APPLICANT: Xumei ZHAO
; APPLICANT: Karen GLATT
; TITLE OF INVENTION: Nucleic Acid Molecules and Proteins For The Identification,
; TITLE OF INVENTION: Assessment, Prevention, and Therapy of Ovarian Cancer
; FILE REFERENCE: MRI-030
; CURRENT APPLICATION NUMBER: US/10/097,340
; CURRENT FILING DATE: 2002-03-14
; PRIOR APPLICATION NUMBER: 60/276,025
; PRIOR FILING DATE: 2001-03-14
; PRIOR APPLICATION NUMBER: 60/325,149
; PRIOR FILING DATE: 2001-09-26
; PRIOR APPLICATION NUMBER: 60/276,026
; PRIOR FILING DATE: 2001-03-14
; PRIOR APPLICATION NUMBER: 60/324,967
; PRIOR FILING DATE: 2001/09/26
; PRIOR APPLICATION NUMBER: 60/311,732
; PRIOR FILING DATE: 2001-08-10
; PRIOR APPLICATION NUMBER: 60/325,102
; PRIOR FILING DATE: 2001-09-26
; PRIOR APPLICATION NUMBER: 60/323,580
; PRIOR FILING DATE: 2001-09-19
; NUMBER OF SEQ ID NOS: 363
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 111
; LENGTH: 257
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-097-340-111 

Query Match 59.7%; Score 40; DB 15; Length 257; 

Best Local Similarity 71.4%; Pred. No. 3e+02;

Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CENWWGD 7
       || || |
Db 139 CEQWWED 145



RESULT 10 
US-10-189-977-13 

; Sequence 13, Application US/10189977
; Publication No. US20030166211A1
; GENERAL INFORMATION:
; APPLICANT: Taylor, Diane E.
; APPLICANT: Ge, Zhongming
; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE
; FILE REFERENCE: 07254/049001
; CURRENT APPLICATION NUMBER: US/10/189,977
; CURRENT FILING DATE: 2002-07-03
; PRIOR APPLICATION NUMBER: US/09/092,315
; PRIOR FILING DATE: 1998-06-05
; PRIOR APPLICATION NUMBER: US 60/048,857
; PRIOR FILING DATE: 1997-06-06
; NUMBER OF SEQ ID NOS: 22
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 13
; LENGTH: 372
; TYPE: PRT
; ORGANISM: Helicobacter pylori
US-10-189-977-13 

Query Match 59.7%; Score 40; DB 12; Length 372; 

Best Local Similarity 100.0%; Pred. No. 4.1e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  3 NWWGD 7
      |||||
Db 32 NWWGD 36



RESULT 11 
US-10-120-319-13 

; Sequence 13, Application US/10120319 

; Publication No. US20020164749A1 

; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming 

; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE

; FILE REFERENCE: 07254/049001

; CURRENT APPLICATION NUMBER: US/10/120,319

; CURRENT FILING DATE: 2002-04-09 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 09/092,315 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: US 60/048,857 
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-06-06 
; NUMBER OF SEQ ID NOS: 22 

; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 13
; LENGTH: 372
; TYPE: PRT
; ORGANISM: Helicobacter pylori
US-10-120-319-13 

Query Match 59.7%; Score 40; DB 14; Length 372; 

Best Local Similarity 100.0%; Pred. No. 4.1e+02;

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  3 NWWGD 7
      |||||
Db 32 NWWGD 36



RESULT 12 
US-09-733-524-16 

; Sequence 16, Application US/09733524 
; Patent No. US20020068347A1 
; GENERAL INFORMATION: 

; APPLICANT: The Governers of the University of Alberta, a Canada Corporation 

; APPLICANT: Taylor, Diane E.
; APPLICANT: Ge, Zhongming
; TITLE OF INVENTION: NUCLEIC ACIDS ENCODING ALPHA-1,3
; TITLE OF INVENTION: FUCOSYLTRANSFERASES AND EXPRESSION SYSTEMS FOR MAKING AND
; TITLE OF INVENTION: EXPRESSING THEM
; FILE REFERENCE: 07254/049002
; CURRENT APPLICATION NUMBER: US/09/733,524

; CURRENT FILING DATE: 2000-12-14 

; PRIOR APPLICATION NUMBER: 09/092,315 

; PRIOR FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: 60/048,857 

; PRIOR FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS: 20
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 16
; LENGTH: 424
; TYPE: PRT
; ORGANISM: Helicobacter pylori fucosyltransferase
; FEATURE:
; NAME/KEY: PEPTIDE
; LOCATION: (0)...(0)
; OTHER INFORMATION: Strain 26695B
US-09-733-524-16

Query Match 59.7%; Score 40; DB 9; Length 424;

Best Local Similarity 100.0%; Pred. No. 4.5e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 3 NWWGD 7 



RESULT 13 
US-10-189-977-6 

; Sequence 6, Application US/10189977 
; Publication No. US20030166211A1 
; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E.
; APPLICANT: Ge, Zhongming
; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE
; FILE REFERENCE: 07254/049001
; CURRENT APPLICATION NUMBER: US/10/189,977
; CURRENT FILING DATE: 2002-07-03
; PRIOR APPLICATION NUMBER: US/09/092,315

; PRIOR FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: US 60/048,857 

; PRIOR FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS: 22
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 6
; LENGTH: 425
; TYPE: PRT
; ORGANISM: Helicobacter pylori
US-10-189-977-6 

Query Match 59.7%; Score 40; DB 12; Length 425; 

Best Local Similarity 100.0%; Pred. No. 4.5e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  3 NWWGD 7
      |||||
Db 33 NWWGD 37



RESULT 14 
US-10-392-098-6 

; Sequence 6, Application US/10392098 

; Publication No. US20030166212A1 

; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming
; TITLE OF INVENTION: NUCLEIC ACIDS ENCODING ALPHA-1,3
; TITLE OF INVENTION: FUCOSYLTRANSFERASES AND EXPRESSION SYSTEMS FOR MAKING AND
; TITLE OF INVENTION: EXPRESSING THEM (amended)
; FILE REFERENCE: 07254-049002
; CURRENT APPLICATION NUMBER: US/10/392,098
; CURRENT FILING DATE: 2003-03-17
; PRIOR APPLICATION NUMBER: US/09/733,524A

; PRIOR FILING DATE: 2000-12-07 

; PRIOR APPLICATION NUMBER: US 09/092,315 

; PRIOR FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: US 60/048,857 

; PRIOR FILING DATE: 1997-06-06 

; NUMBER OF SEQ ID NOS: 27 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 6 

; LENGTH: 425
; TYPE: PRT
; ORGANISM: Helicobacter pylori
US-10-392-098-6 

Query Match 59.7%; Score 40; DB 12; Length 425; 

Best Local Similarity 100.0%; Pred. No. 4.5e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  3 NWWGD 7
      |||||
Db 33 NWWGD 37



RESULT 15 
US-10-120-319-6 

; Sequence 6, Application US/10120319 

; Publication No. US20020164749A1 

; GENERAL INFORMATION: 

; APPLICANT: Taylor, Diane E. 

; APPLICANT: Ge, Zhongming
; TITLE OF INVENTION: ALPHA-1,3-FUCOSYLTRANFERASE
; FILE REFERENCE: 07254/049001
; CURRENT APPLICATION NUMBER: US/10/120,319

; CURRENT FILING DATE: 2002-04-09 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 09/092,315 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-05 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: US 60/048,857
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-06-06
; NUMBER OF SEQ ID NOS: 22
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 6
; LENGTH: 425
; TYPE: PRT
; ORGANISM: Helicobacter pylori
US-10-120-319-6 

Query Match 59.7%; Score 40; DB 14; Length 425; 

Best Local Similarity 100.0%; Pred. No. 4.5e+02; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  3 NWWGD 7
      |||||
Db 33 NWWGD 37



Search completed: November 13, 2003, 09:58:27 
Job time: 18.6562 secs



GenCore version 5.1.6 
OM protein - protein search, using sw model

Run on:          ; Search time 9.375 Seconds
                 (without alignments)

Title:           US-09-228-866-2
Perfect score:   67
Sequence:        1 CENWWGDVC 9

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        283308 seqs, 96168682 residues

Total number of hits satisfying chosen parameters:   283308

Minimum DB seq length: 0
Maximum DB seq length: 2000000000
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Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
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ALIGNMENTS 



RESULT 1 
T16677 

hypothetical protein R04A9.3 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 04-Mar-2000
C;Accession: T16677
R;Geisel, C.
submitted to the EMBL Data Library, November 1995
A;Description: The sequence of C. elegans cosmid R04A9.
A;Reference number: Z18558
A;Accession: T16677
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-267 <GEI>
A;Cross-references: EMBL:U41550; NID:g1118045; PID:g1118047; PIDN:AAA83285.1;
CESP:R04A9.3
C;Genetics:
A;Gene: CESP:R04A9.3
A;Introns: 15/3; 44/3; 80/2; 136/3; 160/1; 197/1; 250/1
C;Superfamily: Caenorhabditis elegans hypothetical protein R04A9.3

Query Match 65.7%; Score 44; DB 2; Length 267; 

Best Local Similarity 62.5%; Pred. No. 14; 

Matches 5; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy  1 CENWWGDV 8
      |: ||||:
Db 19 CQAWWGDL 26



RESULT 2 
S73550 

DNA polymerase III gamma-tau chain dnaX - Mycoplasma pneumoniae (strain ATCC
29342)
N;Alternate names: hypothetical protein C12_orf681
C;Species: Mycoplasma pneumoniae
A;Variety: ATCC 29342
C;Date: 27-Feb-1997 #sequence_revision 25-Apr-1997 #text_change 07-Dec-1999
C;Accession: S73550
R;Himmelreich, R.; Hilbert, H.; Plagens, H.; Pirkl, E.; Li, B.C.; Herrmann, R.
Nucleic Acids Res. 24, 4420-4449, 1996
A;Title: Complete sequence analysis of the genome of the bacterium Mycoplasma
pneumoniae.
A;Reference number: S73327; MUID:97105885; PMID:8948633
A;Accession: S73550
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-681 <HIM>
A;Cross-references: EMBL:AE000022; GB:U00089; NID:g1673882; PIDN:AAB95872.1;
PID:g1673890
A;Note: the nucleotide sequence was submitted to the EMBL Data Library, November
1996
C;Genetics:
A;Gene: dnaX
A;Genetic code: SGC3
C;Superfamily: DNA-directed DNA polymerase III gamma chain

Query Match 65.7%; Score 44; DB 2; Length 681; 

Best Local Similarity 77.8%; Pred. No. 32; 

Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
Qy  1 CENWWGDVC 9
      | || ||||
Db 63 CLNWNGDVC 71



RESULT 3 
AF2481 

hypothetical protein all7030 [imported] - Nostoc sp. (strain PCC 7120) plasmid
pCC7120alpha
C;Species: Nostoc sp. PCC 7120
A;Note: Nostoc sp. strain PCC 7120 is a synonym of Anabaena sp. strain PCC 7120
C;Date: 14-Dec-2001 #sequence_revision 14-Dec-2001 #text_change 09-Dec-2002
C;Accession: AF2481
R;Kaneko, T.; Nakamura, Y.; Wolk, C.P.; Kuritz, T.; Sasamoto, S.; Watanabe, A.;
Iriguchi, M.; Ishikawa, A.; Kawashima, K.; Kimura, T.; Kishida, Y.; Kohara, M.;
Matsumoto, M.; Matsuno, A.; Muraki, A.; Nakazaki, N.; Shimpo, S.; Sugimoto, M.;
Takazawa, M.; Yamada, M.; Yasuda, M.; Tabata, S.
DNA Res. 8, 205-213, 2001
A;Title: Complete Genomic Sequence of the Filamentous Nitrogen-fixing
Cyanobacterium Anabaena sp. strain PCC 7120.
A;Reference number: AB1807; MUID:21595285; PMID:11759840
A;Accession: AF2481
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-1132 <KUR>
A;Cross-references: GB:BA000020; PIDN:BAB78114.1; PID:g17135568; GSPDB:GN00180
A;Experimental source: strain PCC 7120
C;Genetics:
A;Gene: all7030
A;Genome: plasmid

Query Match 65.7%; Score 44; DB 2; Length 1132; 

Best Local Similarity 62.5%; Pred. No. 51;

Matches 5; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy    1 CENWWGDV 8
        |::||| |
Db 1065 CDSWWGQV 1072



RESULT 4 
B38919 

hypothetical protein 2 - human (fragment)
C;Species: Homo sapiens (man)
C;Date: 05-Aug-1994 #sequence_revision 05-Aug-1994 #text_change 04-Mar-2000
C;Accession: B38919
R;Nomura, N.; Miyajima, N.; Kawarabayashi, Y.; Tabata, S.
submitted to the EMBL Data Library, May 1994
A;Description: Prediction of new human genes by entire sequencing of randomly
sampled cDNA clones.
A;Reference number: A38919
A;Accession: B38919
A;Molecule type: mRNA
A;Residues: 1-1054 <NOM>
A;Cross-references: EMBL:D25215
C;Superfamily: ubiquitin-protein ligase homology
F;731-1049/Domain: ubiquitin-protein ligase homology <UBI>

Query Match 64.2%; Score 43; DB 2; Length 1054; 

Best Local Similarity 62.5%; Pred. No. 67; 

Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy   2 ENWWGDVC 9
       :|||  ||
Db 549 DNWWSQVC 556



RESULT 5 
A45724 

pectate lyase (EC 4.2.2.2) - fungus (Fusarium solani)
C;Species: Fusarium solani
C;Date: 21-Sep-1993 #sequence_revision 18-Nov-1994 #text_change 21-Jul-2000
C;Accession: A45724
R;Gonzalez-Candelas, L.; Kolattukudy, P.E.
J. Bacteriol. 174, 6343-6349, 1992
A;Title: Isolation and analysis of a novel inducible pectate lyase gene from the
phytopathogenic fungus Fusarium solani f. sp. pisi (Nectria haematococca, mating
population VI).
A;Reference number: A45724; MUID:93015682; PMID:1400187
A;Accession: A45724
A;Status: preliminary
A;Molecule type: DNA; protein
A;Residues: 1-242 <GON>
A;Cross-references: GB:M94691; NID:g168155; PIDN:AAA33338.1; PID:g168156
A;Experimental source: isolate T8
A;Note: sequence extracted from NCBI backbone (NCBIN:115473, NCBIP:115474)
C;Keywords: carbon-oxygen lyase

Query Match 61.2%; Score 41; DB 2; Length 242; 

Best Local Similarity 83.3%; Pred. No. 35; 

Matches 5; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   4 WWGDVC 9
       || |||
Db 102 WWADVC 107



RESULT 6 
T24191 

hypothetical protein R11D1.10 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T24191
R;Steward, C.
submitted to the EMBL Data Library, June 1996
A;Reference number: Z19850
A;Accession: T24191
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-443 <WIL>
A;Cross-references: EMBL:Z75547; PIDN:CAA99905.2; GSPDB:GN00023; CESP:R11D1.
A;Experimental source: clone R11D1
C;Genetics:
A;Gene: CESP:R11D1.10
A;Map position: 5
A;Introns: 27/3; 76/3; 135/3; 188/1; 228/3; 318/2; 348/1

Query Match 61.2%; Score 41; DB 2; Length 443; 

Best Local Similarity 83.3%; Pred. No. 61; 

Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy  2 ENWWGD 7
      :|||||
Db 91 KNWWGD 96



RESULT 7 
A53506 

folate receptor type gamma - human
N;Contains: folate receptor type gamma'
C;Species: Homo sapiens (man)
C;Date: 06-Oct-1994 #sequence_revision 18-Nov-1994 #text_change 21-Jul-2000
C;Accession: A53506; B53506
R;Shen, F.; Ross, J.F.; Wang, X.; Ratnam, M.
Biochemistry 33, 1209-1215, 1994
A;Title: Identification of a novel folate receptor, a truncated receptor, and
receptor type beta in hematopoietic cells: cDNA cloning, expression,
immunoreactivity, and tissue specificity.
A;Reference number: A53506; MUID:94153905; PMID:8110752
A;Accession: A53506
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-243 <SHE>
A;Cross-references: GB:Z32564; NID:g473235; PIDN:CAA83553.1; PID:g473236
A;Experimental source: CML patient, spleen and bone marrow
A;Note: sequence extracted from NCBI backbone (NCBIN:145218, NCBIP:145219)
A;Accession: B53506
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-104 <SH2>
A;Cross-references: GB:Z32633; NID:g474060; PIDN:CAA83566.1; PID:g474061
A;Experimental source: CML patient, spleen and bone marrow
A;Note: sequence extracted from NCBI backbone (NCBIN:145220, NCBIP:145221)
C;Genetics:
A;Gene: GDB:FOLR3
A;Cross-references: GDB:306562
C;Superfamily: folate-binding protein

Query Match 59.7%; Score 40; DB 2; Length 243; 

Best Local Similarity 71.4%; Pred. No. 50; 

Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CENWWGD 7
       || || |
Db 137 CERWWED 143



RESULT 8 
A45753 

folate-binding protein precursor - human
N;Contains: folate receptor; tumor-associated antigen
C;Species: Homo sapiens (man)
C;Date: 03-Jun-1993 #sequence_revision 03-Jun-1993 #text_change 13-Aug-1999
C;Accession: A44904; B44904; A36515; A45753; S21763; S24405; S47554; A32864;
A47570; A28316; B28316
R;Coney, L.R.; Tomassetti, A.; Carayannopoulos, L.; Frasca, V.; Kamen, B.A.;
Colnaghi, M.I.; Zurawski Jr., V.R.
Cancer Res. 51, 6125-6132, 1991
A;Title: Cloning of a tumor-associated antigen: MOv18 and MOv19 antibodies
recognize a folate-binding protein.
A;Reference number: A44904; MUID:92034730; PMID:1840502
A;Accession: A44904
A;Molecule type: mRNA
A;Residues: 1-257 <CON>
A;Cross-references: GB:U20391; NID:g1483626; PIDN:AAB05827.1; PID:g1483627
A;Experimental source: ovarian carcinoma cell line IGROV1
A;Note: sequence extracted from NCBI backbone (NCBIN:66569, NCBIP:66571)
A;Accession: B44904
A;Molecule type: protein
A;Residues: 31-36,'X',38-41,'WX',44-47,'X',49,'X',51-53,'XX',56,'X' <CO2>
A;Note: sequence extracted from NCBI backbone (NCBIP:66567)
R;Elwood, P.C.
J. Biol. Chem. 264, 14893-14901, 1989
A;Title: Molecular cloning and characterization of the human folate-binding
protein cDNA from placenta and malignant tissue culture (KB) cells.
A;Reference number: A36515; MUID:89359294; PMID:2768245
A;Accession: A36515
A;Molecule type: mRNA
A;Residues: 1-45,'R',47-257 <ELW>
A;Cross-references: GB:J05013
A;Experimental source: nasopharyngeal epidermoid carcinoma cell line KB
A;Note: the authors translated the codon AGG for residue 46 as Lys
R;Lacey, S.W.; Sanders, J.M.; Rothberg, K.G.; Anderson, R.G.W.; Kamen, B.A.
J. Clin. Invest. 84, 715-720, 1989
A;Title: Complementary DNA for the folate binding protein correctly predicts
anchoring to the membrane by glycosyl-phosphatidylinositol.
A;Reference number: A45753; MUID:89340896; PMID:2527252
A;Accession: A45753



A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-257 <LAC>
A;Cross-references: GB:M28099; NID:g182415; PIDN:AAA35822.1; PID:g182416
R;Sadasivan, E.; Cedeno, M.; Rothenberg, S.P.
Biochim. Biophys. Acta 1131, 91-94, 1992
A;Title: Genomic organization of the gene and a related pseudogene for a human
folate binding protein.
A;Reference number: S21763; MUID:92256496; PMID:1581364
A;Accession: S21763
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-257 <SAD>

R;Campbell, I.G.; Jones, T.A.; Foulkes, W.D.; Trowsdale, J.
Cancer Res. 51, 5329-5338, 1991
A;Title: Folate-binding protein is a marker for ovarian cancer.
A;Reference number: S24405; MUID:92005454; PMID:1717147
A;Accession: S24405
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-257 <CAM>
A;Cross-references: EMBL:X62753; NID:g28428; PIDN:CAA44610.1; PID:g28429
R;Prasad, P.D.; Ramamoorthy, S.; Moe, A.J.; Smith, C.H.; Leibach, F.H.;
Ganapathy, V.
Biochim. Biophys. Acta 1223, 71-75, 1994
A;Title: Selective expression of the high-affinity isoform of the folate
receptor (FR-alpha) in the human placental syncytiotrophoblast and
choriocarcinoma cells.
A;Reference number: S47554; MUID:94339186; PMID:8061055
A;Accession: S47554
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-105,162-257 <PRA>

R;Sadasivan, E.; Rothenberg, S.P.
J. Biol. Chem. 264, 5806-5811, 1989
A;Title: The complete amino acid sequence of a human folate binding protein from
KB cells determined from the cDNA.
A;Reference number: A32864; MUID:89174638; PMID:2538429
A;Accession: A32864
A;Status: preliminary
A;Molecule type: mRNA

A;Residues: 24-183,185-249 <SA3>

A;Cross-references: GB:M25317; NID:g182421; PIDN:AAA74896.1; PID:g182422
R;Sadasivan, E.; Rothenberg, S.P.
Proc. Soc. Exp. Biol. Med. 189, 240-244, 1988
A;Title: Molecular cloning of the complementary DNA for a human folate binding
protein (42804).
A;Reference number: A47570; MUID:89057954; PMID:3194438
A;Accession: A47570
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 24-45 <SA2>
A;Cross-references: EMBL:M35069; NID:g182419; PIDN:AAA35824.1; PID:g182420
R;Luhrs, C.A.; Pitiranggon, P.; da Costa, M.; Rothenberg, S.P.; Slomiany, B.L.;
Brink, L.; Tous, G.I.; Stein, S.
Proc. Natl. Acad. Sci. U.S.A. 84, 6546-6549, 1987
A;Title: Purified membrane and soluble folate binding proteins from cultured KB
cells have similar amino acid compositions and molecular weights but differ in
fatty acid acylation.
A;Reference number: A28316; MUID:87317689; PMID:3476960
A;Accession: A28316
A;Status: preliminary
A;Molecule type: protein
A;Residues: 26-36,'X',38-43 <LUH>
A;Experimental source: KB cells
C;Genetics:
A;Gene: GDB:FOLR1; FOLR
A;Cross-references: GDB:128061; OMIM:136430
A;Map position: 11q13.3-11q14.1
C;Superfamily: folate-binding protein
F;1-25/Domain: signal sequence #status predicted <SIG>
F;26-257/Product: folate-binding protein #status predicted <MAT>
F;31-257/Product: tumor-associated antigen #status experimental <ANT>

Query Match 59.7%; Score 40; DB 2; Length 257;

Best Local Similarity 71.4%; Pred. No. 53;

Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CENWWGD 7
       || || |
Db 139 CEQWWED 145



RESULT 9 
C64567 

fucosyltransferase - Helicobacter pylori (strain 26695)
C;Species: Helicobacter pylori
C;Date: 09-Aug-1997 #sequence_revision 09-Aug-1997 #text_change 08-Oct-1999
C;Accession: C64567
R;Tomb, J.F.; White, O.; Kerlavage, A.R.; Clayton, R.A.; Sutton, G.G.;
Fleischmann, R.D.; Ketchum, K.A.; Klenk, H.P.; Gill, S.; Dougherty, B.A.;
Nelson, K.; Quackenbush, J.; Zhou, L.; Kirkness, E.F.; Peterson, S.; Loftus, B.;
Richardson, D.; Dodson, R.; Khalak, H.G.; Glodek, A.; McKenney, K.; Fitzegerald,
L.M.; Lee, N.; Adams, M.D.; Hickey, E.K.; Berg, D.E.; Gocayne, J.D.; Utterback,
T.R.; Peterson, J.D.; Kelley, J.M.; Cotton, M.D.; Weidman, J.M.; Fujii, C.;
Bowman, C.; Watthey, L.
Nature 388, 539-547, 1997
A;Authors: Wallin, E.; Hayes, W.S.; Borodovsky, M.; Karpk, P.D.; Smith, H.O.;
Fraser, C.M.; Venter, J.C.
A;Title: The complete genome sequence of the gastric pathogen Helicobacter
pylori.
A;Reference number: A64520; MUID:97394467; PMID:9252185
A;Accession: C64567
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-425 <TOM>
A;Cross-references: GB:AE000554; GB:AE000511; NID:g2313475; PIDN:AAD07447.1;
PID:g2313482; TIGR:HP0379

Query Match 59.7%; Score 40; DB 2; Length 425; 

Best Local Similarity 100.0%; Pred. No. 83; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  3 NWWGD 7
      |||||
Db 33 NWWGD 37




RESULT 10 
C64601 

fucosyl trans f erase - Helicobacter pylori (strain 26695) 
C; Species: Helicobacter pylori 

C;Date: 09-Aug-1997 #sequence_revision 09~Aug-1997 #text_change 08-Oct-1999 
C; Access ion: C64601 

R;Tomb, J.F.; White, O.; Kerlavage, A.R.; Clayton, R.A.; Sutton, G.G.;
Fleischmann, R.D.; Ketchum, K.A.; Klenk, H.P.; Gill, S.; Dougherty, B.A.;
Nelson, K.; Quackenbush, J.; Zhou, L.; Kirkness, E.F.; Peterson, S.; Loftus, B.;
Richardson, D.; Dodson, R.; Khalak, H.G.; Glodek, A.; McKenney, K.; Fitzegerald,
L.M.; Lee, N.; Adams, M.D.; Hickey, E.K.; Berg, D.E.; Gocayne, J.D.; Utterback,
T.R.; Peterson, J.D.; Kelley, J.M.; Cotton, M.D.; Weidman, J.M.; Fujii, C.;
Bowman, C.; Watthey, L.
Nature 388, 539-547, 1997 

A;Authors: Wallin, E.; Hayes, W.S.; Borodovsky, M.; Karpk, P.D.; Smith, H.O.;
Fraser, C.M.; Venter, J.C.
A;Title: The complete genome sequence of the gastric pathogen Helicobacter
pylori.
A;Reference number: A64520; MUID:97394467; PMID:9252185
A;Accession: C64601
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-476 <TOM>
A;Cross-references: GB:AE000578; GB:AE000511; NID:g2313759; PIDN:AAD07710.1;
PID:g2313769; TIGR:HP0651

Query Match 59.7%; Score 40; DB 2; Length 476; 

Best Local Similarity 100.0%; Pred. No. 92; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          3 NWWGD 7
              |||||
Db         33 NWWGD 37



RESULT 11
D72075

hypothetical protein - Chlamydophila pneumoniae (strain CWL029)
C;Species: Chlamydophila pneumoniae, Chlamydia pneumoniae
C;Date: 23-Apr-1999 #sequence_revision 23-Apr-1999 #text_change 02-Sep-2000
C;Accession: D72075
R;Kalman, S.; Mitchell, W.; Marathe, R.; Lammel, C.; Fan, J.; Olinger, L.;
Grimwood, J.; Davis, R.W.; Stephens, R.S.
Nature Genet. 21, 385-389, 1999
A;Title: Comparative genomes of Chlamydia pneumoniae and C. trachomatis.
A;Reference number: A72000; MUID:99206606; PMID:10192388
A;Accession: D72075
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-582 <ARN>
A;Cross-references: GB:AE001630; GB:AE001363; NID:g4376740; PIDN:AAD18599.1;
PID:g4376741
A;Experimental source: strain CWL029
C;Genetics:
A;Gene: CPn0457
C;Superfamily: Chlamydia hypothetical protein CPn0462

Query Match 59.7%; Score 40; DB 2; Length 582;
Best Local Similarity 66.7%; Pred. No. 1.1e+02;
Matches 4; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy          1 CENWWG 6
              |::|||
Db        426 CDSWWG 431



RESULT 12 
E86547 

hypothetical protein CPj0457 [imported] - Chlamydophila pneumoniae (strain J138) 
C;Species: Chlamydophila pneumoniae, Chlamydia pneumoniae
C;Date: 02-Mar-2001 #sequence_revision 02-Mar-2001 #text_change 23-Mar-2001
C;Accession: E86547
R;Shirai, M.; Hirakawa, H.; Kimoto, M.; Tabuchi, M.; Kishi, F.; Ouchi, K.;
Shiba, T.; Ishii, K.; Hattori, M.; Kuhara, S.; Nakazawa, T.
Nucleic Acids Res. 28, 2311-2314, 2000
A;Title: Comparison of whole genome sequences of Chlamydia pneumoniae J138.
A;Reference number: A86491; MUID:20330349; PMID:10871362
A;Accession: E86547
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-629 <STO>
A;Cross-references: GB:BA000008; NID:g8978827; PIDN:BAA98663.1; GSPDB:GN00142
A;Experimental source: strain J138
C;Genetics:
A;Gene: CPj0457
C;Superfamily: Chlamydia hypothetical protein CPn0462

Query Match 59.7%; Score 40; DB 2; Length 629; 

Best Local Similarity 66.7%; Pred. No. 1.2e+02; 

Matches 4; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 CENWWG 6
              |::|||
Db        426 CDSWWG 431



RESULT 13 
G81592 

hypothetical protein CP0295 [imported] - Chlamydophila pneumoniae (strain AR39) 
C;Species: Chlamydophila pneumoniae, Chlamydia pneumoniae
C;Date: 31-Mar-2000 #sequence_revision 31-Mar-2000 #text_change 02-Sep-2000
C;Accession: G81592
R;Read, T.D.; Brunham, R.C.; Shen, C.; Gill, S.R.; Heidelberg, J.F.; White, O.;
Hickey, E.K.; Peterson, J.; Utterback, T.; Berry, K.; Bass, S.; Linher, K.;
Weidman, J.; Khouri, H.; Craven, B.; Bowman, C.; Dodson, R.; Gwinn, M.; Nelson,
W.; DeBoy, R.; Kolonay, J.; McClarty, G.; Salzberg, S.L.; Eisen, J.; Fraser,
C.M.
Nucleic Acids Res. 28, 1397-1406, 2000
A;Title: Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae
AR39.
A;Reference number: A81500; MUID:20150255; PMID:10684935
A;Accession: G81592
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-629 <REA>
A;Cross-references: GB:AE002191; GB:AE002161; NID:g7189216; PIDN:AAF38152.1;
PID:g7189221; GSPDB:GN00122; TIGR:CP0295
A;Experimental source: strain AR39, HL cells
C;Genetics:
A;Gene: CP0295
C;Superfamily: Chlamydia hypothetical protein CPn0462

Query Match 59.7%; Score 40; DB 2; Length 629; 

Best Local Similarity 66.7%; Pred. No. 1.2e+02; 

Matches 4; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 CENWWG 6
              |::|||
Db        426 CDSWWG 431



RESULT 14 
T40823 

probable para-aminobenzoate synthase - fission yeast (Schizosaccharomyces pombe) 
C;Species: Schizosaccharomyces pombe
C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 31-Jan-2000
C;Accession: T40823
R;Beck, A.; Reinhardt, R.; Lyne, M.; Rajandream, M.A.; Barrell, B.G.
submitted to the EMBL Data Library, October 1998
A;Reference number: Z21949
A;Accession: T40823
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-718 <BEC>
A;Cross-references: EMBL:AL032684; PIDN:CAA21814.1; GSPDB:GN00067;
SPDB:SPBP8B7.29
A;Experimental source: strain 972h-; clone pi p8B7
C;Genetics:
A;Gene: SPDB:SPBP8B7.29
A;Map position: 2
C;Superfamily: yeast p-aminobenzoate synthase; trpG homology

Query Match 59.7%; Score 40; DB 2; Length 718;
Best Local Similarity 50.0%; Pred. No. 1.3e+02;
Matches 4; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CENWWGDV 8
              |  |||::
Db        404 CSEWWGEL 411



RESULT 15 
T30553 

disease resistance protein Hcr2-5D - tomato 
C;Species: Lycopersicon esculentum (tomato)
C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 11-May-2000
C;Accession: T30553
R;Dixon, M.S.; Hatzixanthis, K.; Jones, D.A.; Harrison, K.; Jones, J.D.G.
Plant Cell 10, 1915-1926, 1998
A;Title: The tomato Cf-5 disease resistance gene and six homologues show
pronounced allelic variation in leucine-rich repeat copy number.
A;Reference number: Z20855; MUID:99030197; PMID:9811798
A;Accession: T30553
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1016 <DIX>
A;Cross-references: EMBL:AF053998; NID:g3894392; PID:g3894393; PIDN:AAC78596.1
C;Genetics:
A;Note: Hcr2-5D



Query Match 59.7%; Score 40; DB 2; Length 1016;
Best Local Similarity 55.6%; Pred. No. 1.8e+02;
Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0;

Qy          1 CENWWGDVC 9
              |::|:| ||
Db         59 CKDWYGVVC 67



Search completed: November 13, 2003, 09:52:50 
Job time : 10.375 secs

GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.



OM protein - protein search, using sw model

Run on:          November 13, 2003, 09:31:40 ; Search time 5.15625 Seconds
                 (without alignments)
                 82.083 Million cell updates/sec

Title:           US-09-228-866-2
Perfect score:   67
Sequence:        1 CENWWGDVC 9

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        127863 seqs, 47026705 residues

Total number of hits satisfying chosen parameters: 127863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       SwissProt_41:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
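
The header values and the per-result statistics in this listing appear to be linked by simple arithmetic. The sketch below is an illustration only, not part of the GenCore output: it assumes that "Perfect score" is the query's self-alignment score under the standard BLOSUM62 matrix and that "Query Match" is the hit score divided by that perfect score. The function names `perfect_score` and `query_match_percent` are hypothetical; both assumptions reproduce the values printed for query CENWWGDVC (perfect score 67; score 44 gives 65.7%, score 40 gives 59.7%).

```python
# Diagonal (self-match) entries of the standard BLOSUM62 substitution matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6, "H": 8, "I": 4,
    "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4, "T": 5, "W": 11, "Y": 7, "V": 4,
}


def perfect_score(query: str) -> int:
    """Score of the query aligned against itself (assumed meaning of 'Perfect score')."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match_percent(score: float, query: str) -> float:
    """Hit score as a percentage of the perfect score (assumed meaning of 'Query Match')."""
    return round(100.0 * score / perfect_score(query), 1)


if __name__ == "__main__":
    query = "CENWWGDVC"
    print(perfect_score(query))            # 67
    print(query_match_percent(44, query))  # 65.7
    print(query_match_percent(40, query))  # 59.7
```

Under the same assumptions, an ungapped alignment score can be checked by summing the BLOSUM62 entries for each aligned residue pair.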

SUMMARIES

               %
Result       Query
 No.  Score  Match Length  DB  ID           Description
   1     44   65.7    681   1  DP3X_MYCPN   P75177 mycoplasma
   2     43   64.2   1050   1  HER3_HUMAN   Q15034
   3     42   62.7   1151   1  XPO4_HUMAN   Q9C0E2
   4     42   62.7   1151   1  XPO4_MOUSE   Q9ESJ0 mus musculu
   5     40   59.7    243   1  FOL3_HUMAN   P41439 homo sapien
   6     40   59.7    257   1  FOL1_HUMAN   P15328 homo sapien
   7                  455      ANL3_MOUSE   Q9R182 mus musculu
                            1                      tribolium c
              58.2    500
              58.2          1  ENV_HV1EL           human immun
              58.2             ADP1_YEAST
  12     39   58.2                                 haemophilus
                               CYA8_RAT     P40146 rattus norv
              56.7   1249   1  CYA8_MOUSE   P97490 mus musculu
  16     38   56.7   1251   1  CYA8_HUMAN   P40145 homo sapien
              55.2          1  XG_HUMAN     P55808 homo sapien
  18     37   55.2          1  FOL1_MOUSE   P35846 mus musculu
  19     37   55.2    255   1  FOL2_HUMAN   P14207 homo sapien
  20     37   55.2    296   1  STC2_MOUSE   O88452 mus musculu
  21     37   55.2    296   1  STC2_RAT     Q9R0K8 rattus norv
         37   55.2    302   1  STC2_HUMAN          homo sapien
  23     37   55.2                                 macaca neme
                                                   escherichia
              55.2                                 human papil
              55.2                                 bos taurus
              55.2          1  MATE_MOUSE   Q9R1M5 mus musculu
         37   55.2          1  SUIS_HUMAN          homo sapien
         37   55.2          1  CLR1_HUMAN   Q9NYQ6 homo sapien
         37   55.2             CLR1_MOUSE   O35161 mus musculu
                                                   homo sapien
                                                   homo sapien
                                                   mus musculu
                                                   homo sapien
                                                   drosophila
                               AX1_BETVU           beta vulgar
                                                   cavia porce
                               COX3_ASTPE
              53.7    376   1                      arabidopsis
                            1  FTSW_MESVI          mesostigma
  41     36   53.7    417   1  TNAB_PROVU   P28785 proteus vul
  42     36   53.7    465   1  LIPP_CAVPO   P50903 cavia porce
  43     36   53.7    465   1  LIPP_RAT     P27657 rattus norv
  44     36   53.7          1  TYRO_ASPOR   Q00234 aspergillus
  45     36   53.7    605   1  SYA_TREPA    O83980 treponema p


ALIGNMENTS 



RESULT 1 
DP3X_MYCPN

ID DP3X_MYCPN STANDARD; PRT; 681 AA.
AC P75177;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE DNA polymerase III subunit gamma/tau (EC 2.7.7.7).
GN DNAX OR MPN618 OR MP224.
OS Mycoplasma pneumoniae.
OC Bacteria; Firmicutes; Mollicutes; Mycoplasmataceae; Mycoplasma.
OX NCBI_TaxID=2104;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=ATCC 29342 / M129;
RX MEDLINE=97105885; PubMed=8948633;
RA Himmelreich R., Hilbert H., Plagens H., Pirkl E., Li B.-C.,
RA Herrmann R.;
RT "Complete sequence analysis of the genome of the bacterium Mycoplasma
RT pneumoniae.";
RL Nucleic Acids Res. 24:4420-4449(1996).
CC -!- FUNCTION: DNA POLYMERASE III IS A COMPLEX, MULTICHAIN ENZYME
CC RESPONSIBLE FOR MOST OF THE REPLICATIVE SYNTHESIS IN BACTERIA.
CC THIS DNA POLYMERASE ALSO EXHIBITS 3' TO 5' EXONUCLEASE ACTIVITY.
CC -!- CATALYTIC ACTIVITY: N deoxynucleoside triphosphate = N diphosphate
CC + {DNA}(N).
CC -!- SUBUNIT: DNA polymerase III contains a core (composed of alpha,
CC epsilon and theta chains) that associates with a tau subunit. This
CC core dimerizes to form the POLIII' complex. PolIII' associates
CC with the gamma complex (composed of gamma, delta, delta', psi and
CC chi chains) and with the beta chain to form the complete DNA
CC polymerase III complex (By similarity).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AE000022; AAB95872.1; -.
DR PIR; S73550; S73550.
DR InterPro; IPR003593; AAA_ATPase.
DR InterPro; IPR003959; AAA_ATPase_centr.
DR InterPro; IPR000862; RFCdomain.
DR Pfam; PF00004; AAA; 1.
DR SMART; SM00382; AAA; 1.
KW Transferase; DNA-directed DNA polymerase; DNA replication;
KW ATP-binding; Complete proteome.
FT NP_BIND 44 51 ATP (POTENTIAL).
SQ SEQUENCE 681 AA; 76212 MW; E3DDC6A580FFCBCC CRC64;

Query Match 65.7%; Score 44; DB 1; Length 681;
Best Local Similarity 77.8%; Pred. No. 11;
Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CENWWGDVC 9
              | || ||||
Db         63 CLNWNGDVC 71




RESULT 2 
HER3_HUMAN 

ID HER3_HUMAN STANDARD; PRT; 1050 AA.
AC Q15034;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE HECT domain and RCC1-like domain protein 3.
GN HERC3 OR KIAA0032.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Bone marrow;
RX MEDLINE=96051387; PubMed=7584026;
RA Nomura N., Miyajima N., Sazuka T., Tanaka A., Kawarabayasi Y.,
RA Sato S., Nagase T., Seki N., Ishikawa K.-I., Tabata S.;
RT "Prediction of the coding sequences of unidentified human genes. I.
RT The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by
RT analysis of randomly sampled cDNA clones from human immature myeloid
RT cell line KG-1.";
RL DNA Res. 1:27-35(1994).
RN [2]
RP CHARACTERIZATION.
RX MEDLINE=21099818; PubMed=11163799;
RA Cruz C., Ventura F., Bartrons R., Rosa J.L.;
RT "HERC3 binding to and regulation by ubiquitin.";
RL FEBS Lett. 488:74-80(2001).
CC -!- SUBCELLULAR LOCATION: Cytoplasmic. Also found in vesicular-like
CC structures.
CC -!- PTM: Substrate of ubiquitination and is degraded by the
CC proteasome.
CC -!- SIMILARITY: Contains 1 HECT-type E3 ubiquitin-protein ligase
CC domain.
CC -!- SIMILARITY: Contains 7 RCC1 repeats.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; D25215; BAA04945.1; -.
DR Genew; HGNC:4876; HERC3.
DR MIM; 605200; -.
DR InterPro; IPR000569; HECT_domain.
DR InterPro; IPR000408; Reg_chr_condens.
DR Pfam; PF00632; HECT; 1.
DR Pfam; PF00415; RCC1; 4.
DR PRINTS; PR00633; RCCNDNSATION.
DR SMART; SM00119; HECTc; 1.
DR PROSITE; PS50237; HECT; 1.
DR PROSITE; PS00625; RCC1_1; FALSE_NEG.
DR PROSITE; PS00626; RCC1_2; 4.
DR PROSITE; PS50012; RCC1_3; 7.
KW Ubl conjugation pathway; Ubl conjugation; Repeat.
FT REPEAT 1 51 RCC1 1.
FT REPEAT 52 101 RCC1 2.
FT REPEAT 102 154 RCC1 3.
FT REPEAT 156 207 RCC1 4.
FT REPEAT 208 259 RCC1 5.
FT REPEAT 261 311 RCC1 6.
FT REPEAT 313 366 RCC1 7.
FT DOMAIN 951 1050 HECT.
FT BINDING 1018 1018 UBIQUITIN (BY SIMILARITY).
SQ SEQUENCE 1050 AA; 117188 MW; 5F08A1DE1F40B912 CRC64;

Query Match 64.2%; Score 43; DB 1; Length 1050;
Best Local Similarity 62.5%; Pred. No. 25;
Matches 5; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          2 ENWWGDVC 9
              :|||  ||
Db        545 DNWWSQVC 552



RESULT 3 
XPO4_HUMAN

ID XPO4_HUMAN STANDARD; PRT; 1151 AA.
AC Q9C0E2; Q9H934;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Exportin 4 (Exp4).
GN XPO4 OR KIAA1721.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Brain;
RX MEDLINE=21082932; PubMed=11214970;
RA Nagase T., Kikuno R., Hattori A., Kondo Y., Okumura K., Ohara O.;
RT "Prediction of the coding sequences of unidentified human genes. XIX.
RT The complete sequences of 100 new cDNA clones from brain which code
RT for large proteins in vitro.";
RL DNA Res. 7:347-355(2000).
RN [2]
RP SEQUENCE OF 337-1151 FROM N.A.
RA Isogai T., Ota T., Hayashi K., Sugiyama T., Otsuki T., Suzuki Y.,
RA Nishikawa T., Nagai K., Sugano S., Ishibashi T., Fujimori K.,
RA Tanai H., Kimata M., Watanabe M., Hiraoka S., Ishii S., Kawai Y.,
RA Saito K., Yamamoto J., Wakamatsu A., Nakamura Y., Nagahari K.,
RA Masuho Y., Kanehori K.;
RT "NEDO human cDNA sequencing project.";
RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: Mediates nuclear export of eIF-5A (eukaryotic
CC translation initiation factor 5A) and possibly that of other
CC cargoes (By similarity).
CC -!- SUBUNIT: Binds to GTP-bound form of Ran (By similarity).
CC -!- SUBCELLULAR LOCATION: Nuclear; once bound to eIF-5A and Ran
CC the complex translocates to the cytoplasm (By similarity).
CC -!- SIMILARITY: BELONGS TO THE EXPORTIN FAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AB051508; BAB21812.1;
DR EMBL; AK023108; BAB14409.1; ALT_INIT.
DR Genew; HGNC:17796; XPO4.
KW Nuclear protein; Transport; Protein transport.
FT CONFLICT 511 511 L -> S (IN REF. 2).
SQ SEQUENCE 1151 AA; 130139 MW; 38E7EEFC938B07C5 CRC64;

Query Match 62.7%; Score 42; DB 1; Length 1151;
Best Local Similarity 100.0%; Pred. No. 38;
Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 CENWW 5
              |||||
Db        723 CENWW 727



RESULT 4 
XPO4_MOUSE

ID XPO4_MOUSE STANDARD; PRT; 1151 AA.
AC Q9ESJ0;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Exportin 4 (Exp4).
GN XPO4.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RX MEDLINE=20402342; PubMed=10944119;
RA Lipowsky G., Bischoff F.R., Schwarzmaier P., Kraft R., Kostka S.,
RA Hartmann E., Kutay U., Goerlich D.;
RT "Exportin 4: a mediator of a novel nuclear export pathway in higher
RT eukaryotes.";
RL EMBO J. 19:4362-4371(2000).
CC -!- FUNCTION: Mediates nuclear export of eIF-5A (eukaryotic
CC translation initiation factor 5A) and possibly that of other
CC cargoes.
CC -!- SUBUNIT: Binds to GTP-bound form of Ran.
CC -!- SUBCELLULAR LOCATION: Nuclear; once bound to eIF-5A and Ran
CC the complex translocates to the cytoplasm.
CC -!- SIMILARITY: BELONGS TO THE EXPORTIN FAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AF145021; AAG09133.1; -.
DR MGD; MGI:1888526; Xpo4.
KW Nuclear protein; Transport; Protein transport.
SQ SEQUENCE 1151 AA; 129964 MW; 5836A4940EB598BE CRC64;

Query Match 62.7%; Score 42; DB 1; Length 1151;
Best Local Similarity 100.0%; Pred. No. 38;
Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 CENWW 5
              |||||
Db        723 CENWW 727



RESULT 5 
FOL3_HUMAN

ID FOL3_HUMAN STANDARD; PRT; 243 AA.
AC P41439;
DT 01-NOV-1995 (Rel. 32, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Folate receptor gamma precursor (FR-gamma) (Folate receptor 3).
GN FOLR3.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Hematopoietic;
RX MEDLINE=94153905; PubMed=8110752;
RA Shen F., Ross J.F., Wang X., Ratnam M.;
RT "Identification of a novel folate receptor, a truncated receptor, and
RT receptor type beta in hematopoietic cells: cDNA cloning, expression,
RT immunoreactivity, and tissue specificity.";
RL Biochemistry 33:1209-1215(1994).
RN [2]
RP CHARACTERIZATION.
RX MEDLINE=95244494; PubMed=7727426;
RA Shen F., Wu M., Ross J.F., Miller D., Ratnam M.;
RT "Folate receptor type gamma is primarily a secretory protein due to
RT lack of an efficient signal for glycosylphosphatidylinositol
RT modification: protein characterization and cell type specificity.";
RL Biochemistry 34:5660-5665(1995).
CC -!- FUNCTION: BINDS TO FOLATE AND REDUCED FOLIC ACID DERIVATIVES AND
CC MEDIATES DELIVERY OF 5-METHYLTETRAHYDROFOLATE TO THE INTERIOR OF
CC CELLS.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Name=Long;
CC IsoId=P41439-1; Sequence=Displayed;
CC Name=Short;
CC IsoId=P41439-2; Sequence=VSP_001506;
CC -!- TISSUE SPECIFICITY: SPLEEN, THYMUS, BONE MARROW, OVARIAN
CC CARCINOMA, AND UTERINE CARCINOMA.
CC -!- PTM: EIGHT DISULFIDE BONDS ARE PRESENT (PROBABLE).
CC -!- SIMILARITY: BELONGS TO THE FOLATE RECEPTOR FAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; Z32564; CAA83553.1; -.
DR EMBL; Z32633; CAA83566.1;
DR EMBL; U08471; AAA18382.1;
DR EMBL; U08470; AAA18381.1; -.
DR PIR; A53506; A53506.
DR Genew; HGNC:3795; FOLR3.
DR MIM; 602469; -.
DR GO; GO:0005624; C:membrane fraction; TAS.
DR GO; GO:0005542; F:folic acid binding activity; TAS.
DR GO; GO:0015025; F:GPI-anchored membrane-bound receptor; TAS.
DR GO; GO:0015884; P:folate transport; TAS.
DR InterPro; IPR004269; Folate_rec.
DR Pfam; PF03024; Folate_rec; 1.
KW Receptor; Glycoprotein; Signal; Folate-binding; Multigene family;
KW Alternative splicing.
FT SIGNAL 1 23 POTENTIAL.
FT CHAIN 24 243 FOLATE RECEPTOR GAMMA.
FT CARBOHYD 119 119 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 159 159 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 199 199 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 105 243 Missing (in isoform Short).
FT /FTId=VSP_001506.
SQ SEQUENCE 243 AA; 27638 MW; AC7636EB5355647B CRC64;

Query Match 59.7%; Score 40; DB 1; Length 243;
Best Local Similarity 71.4%; Pred. No. 17;
Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CENWWGD 7
              || || |
Db        137 CERWWED 143



RESULT 6 
FOL1_HUMAN

ID FOL1_HUMAN STANDARD; PRT; 257 AA.
AC P15328;
DT 01-APR-1990 (Rel. 14, Created)
DT 01-JUN-1994 (Rel. 29, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Folate receptor alpha precursor (FR-alpha) (Folate receptor 1) (Folate
DE receptor, adult) (Adult folate-binding protein) (FBP) (Ovarian tumor-
DE associated antigen MOv18) (KB cells FBP).
GN FOLR1 OR FOLR.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=89340896; PubMed=2527252;
RA Lacey S.W., Sanders J.M., Rothberg K.G., Anderson R.G.W.,
RA Kamen B.A.;
RT "Complementary DNA for the folate binding protein correctly predicts
RT anchoring to the membrane by glycosyl-phosphatidylinositol.";
RL J. Clin. Invest. 84:715-720(1989).
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=89359294; PubMed=2768245;
RA Elwood P.C.;
RT "Molecular cloning and characterization of the human folate-binding
RT protein cDNA from placenta and malignant tissue culture (KB) cells.";
RL J. Biol. Chem. 264:14893-14901(1989).
RN [3]
RP SEQUENCE FROM N.A.
RC TISSUE=Ovary;
RX MEDLINE=92005454; PubMed=1717147;
RA Campbell I.G., Jones T.A., Foulkes W.D., Trowsdale J.;
RT "Folate-binding protein is a marker for ovarian cancer.";
RL Cancer Res. 51:5329-5338(1991).
RN [4]
RP SEQUENCE FROM N.A.
RC TISSUE=Ovary;
RX MEDLINE=92034730; PubMed=1840502;
RA Coney L.R., Tomassetti A., Carayannopoulos L., Frasca V.,
RA Kamen B.A., Colnaghi M.I., Zurawski V.R. Jr.;
RT "Cloning of a tumor-associated antigen: MOv18 and MOv19 antibodies
RT recognize a folate-binding protein.";
RL Cancer Res. 51:6125-6132(1991).
RN [5]
RP SEQUENCE FROM N.A.
RX MEDLINE=92256496; PubMed=1581364;
RA Sadasivan E., Cedeno M., Rothenberg S.P.;
RT "Genomic organization of the gene and a related pseudogene for a
RT human folate binding protein.";
RL Biochim. Biophys. Acta 1131:91-94(1992).
RN [6]
RP SEQUENCE OF 24-249 FROM N.A.
RX MEDLINE=89174638; PubMed=2538429;
RA Sadasivan E., Rothenberg S.P.;
RT "The complete amino acid sequence of a human folate binding protein
RT from KB cells determined from the cDNA.";
RL J. Biol. Chem. 264:5806-5811(1989).
RN [7]
RP SEQUENCE OF 26-43.
RX MEDLINE=87317689; PubMed=3476960;
RA Luhrs C.A., Pitiranggon P., da Costa M., Rothenberg S.P.,
RA Slomiany B.L., Brink L., Tous G.I., Stein S.;
RT "Purified membrane and soluble folate binding proteins from cultured
RT KB cells have similar amino acid compositions and molecular weights
RT but differ in fatty acid acylation.";
RL Proc. Natl. Acad. Sci. U.S.A. 84:6546-6549(1987).
RN [8]
RP GPI-ANCHOR.
RX MEDLINE=96062525; PubMed=7578066;
RA Yan W., Ratnam M.;
RT "Preferred sites of glycosylphosphatidylinositol modification in
RT folate receptors and constraints in the primary structure of the
RT hydrophobic portion of the signal.";
RL Biochemistry 34:14594-14600(1995).
CC -!- FUNCTION: BINDS TO FOLATE AND REDUCED FOLIC ACID DERIVATIVES AND
CC MEDIATES DELIVERY OF 5-METHYLTETRAHYDROFOLATE TO THE INTERIOR OF
CC CELLS.
CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Name=1; Synonyms=GPI-anchored;
CC IsoId=P15328-1; Sequence=Displayed;
CC Name=2; Synonyms=Cytoplasmic;
CC IsoId=P15328-2; Sequence=Not described;
CC -!- TISSUE SPECIFICITY: FR-ALPHA LEVELS ARE GREATLY ELEVATED IN A
CC VARIETY OF MALIGNANT TISSUES OF EPITHELIAL ORIGIN COMPARED WITH
CC NORMAL TISSUES.
CC -!- PTM: EIGHT DISULFIDE BONDS ARE PRESENT (PROBABLE).
CC -!- SIMILARITY: BELONGS TO THE FOLATE RECEPTOR FAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M28099; AAA35822.1; -.
DR EMBL; X62753; CAA44610.1; -.
DR EMBL; J05013; AAA35823.1; -.
DR EMBL; M25317; AAA74896.1; -.
DR EMBL; U20391; AAB05827.1; -.
DR PIR; A44904; A45753.
DR Genew; HGNC:3791; FOLR1.
DR MIM; 136430; -.
DR GO; GO:0005887; C:integral to plasma membrane; TAS.
DR GO; GO:0005624; C:membrane fraction; TAS.
DR GO; GO:0005542; F:folic acid binding activity; TAS.
DR GO; GO:0015025; F:GPI-anchored membrane-bound receptor; TAS.
DR GO; GO:0015884; P:folate transport; TAS.
DR GO; GO:0006898; P:receptor mediated endocytosis; TAS.
DR InterPro; IPR004269; Folate_rec.
DR Pfam; PF03024; Folate_rec; 1.
KW Receptor; Glycoprotein; Signal; Folate-binding; Membrane; GPI-anchor;
KW Alternative splicing; Polymorphism.
FT SIGNAL 1 24 POTENTIAL.
FT CHAIN 25 234 FOLATE RECEPTOR ALPHA.
FT PROPEP 235 257 REMOVED IN MATURE FORM.
FT LIPID 234 234 GPI-ANCHOR.
FT CARBOHYD 69 69 N-LINKED (GLCNAC...) (BY SIMILARITY).
FT CARBOHYD 161 161 N-LINKED (GLCNAC...) (BY SIMILARITY).
FT CARBOHYD 201 201 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARIANT 160 160 W -> C (IN dbSNP:1801932).
FT /FTId=VAR_011963.
FT CONFLICT 184 184 T -> S (IN REF. 6).
SQ SEQUENCE 257 AA; 29819 MW; D458D8BB047C96A6 CRC64;

Query Match 59.7%; Score 40; DB 1; Length 257;
Best Local Similarity 71.4%; Pred. No. 18;
Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CENWWGD 7
              || || |
Db        139 CEQWWED 145



RESULT 7 
ANL3_MOUSE 

ID ANL3_MOUSE STANDARD; PRT; 455 AA. 

AC Q9R182; 

DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Angiopoietin-related protein 3 precursor (Angiopoietin-like 3).
GN ANGPTL3.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=20112762; PubMed=10644446;
RA Conklin D., Gilbertson D., Taft D.W., Maurer M.F., Whitmore T.E.,
RA Smith D.L., Walker K.M., Chen L.H., Wattler S., Nehls M., Lewis K.B.;
RT "Identification of a mammalian angiopoietin-related protein expressed
RT specifically in liver.";

RL Genomics 62:477-482(1999). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Liver;
RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;
RT "Generation and initial analysis of more than 15,000 full-length
RT human and mouse cDNA sequences.";

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

CC -!- SUBCELLULAR LOCATION: Secreted (By similarity). 

CC -!- SIMILARITY: Contains 1 fibrinogen C-terminal domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF162224; AAD45920.1; 

DR EMBL; BC019491; AAH19491.1; 

DR HSSP; P02671; 1FZD. 

DR MGD; MGI:1353627; Angptl3.
DR InterPro; IPR002181; Fibrinogen_C.
DR Pfam; PF00147; fibrinogen_C; 1.
DR SMART; SM00186; FBG; 1.
DR PROSITE; PS00514; FIBRIN_AG_C_DOMAIN; FALSE_NEG.
KW Signal; Coiled coil; Glycoprotein.



FT SIGNAL 1 16 POTENTIAL.
FT CHAIN 17 455 ANGIOPOIETIN-RELATED PROTEIN 3.
FT DOMAIN 85 206 COILED COIL (POTENTIAL)
FT DOMAIN 241 455 FIBRINOGEN C-TERMINAL.
FT DISULFID 246 274 BY SIMILARITY.
FT DISULFID 394 408 BY SIMILARITY.
FT CARBOHYD 115 115 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 232 232 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 296 296 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 357 357 N-LINKED (GLCNAC...) (POTENTIAL)
SQ SEQUENCE 455 AA; 52543 MW; 31609D3700D3F33D CRC64;



Query Match 59.7%; Score 40; DB 1; Length 455; 

Best Local Similarity 66.7%; Pred. No. 32; 

Matches 4; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 
Qy 4 WWGDVC 9 

Db 403 WWNDIC 408



RESULT 8 
AMY_TRICA 

ID AMY_TRICA STANDARD; PRT; 489 AA.
AC P09107;
DT 01-MAR-1989 (Rel. 10, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 01-OCT-1996 (Rel. 34, Last annotation update)
DE Alpha-amylase precursor (EC 3.2.1.1) (1,4-alpha-D-glucan
DE glucanohydrolase) (Fragment).
OS Tribolium castaneum (Red flour beetle).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Coleoptera; Polyphaga; Cucujiformia;
OC Tenebrionidae; Tribolium.
OX NCBI_TaxID=7070;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=88200288; PubMed=3129570;
RA Hickey D.A., Benkel B.F., Boer P.H., Genest Y., Abukashawa S.,
RA Ben-David G.;
RT "Enzyme-coding genes as molecular clocks: the molecular evolution of
RT animal alpha-amylases.";
RL J. Mol. Evol. 26:252-256(1987).
CC -!- CATALYTIC ACTIVITY: Endohydrolysis of 1,4-alpha-glucosidic
CC linkages in oligosaccharides and polysaccharides.
CC -!- COFACTOR: BINDS A CALCIUM ION REQUIRED FOR ITS ACTIVITY.
CC -!- SIMILARITY: BELONGS TO FAMILY 13 OF GLYCOSYL HYDROLASES, ALSO
CC KNOWN AS THE ALPHA-AMYLASE FAMILY.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X06905; CAA30009.1; -. 

DR PIR; A29347; A29347. 

DR HSSP; P56634; 1JAE. 

DR InterPro; IPR006589; Alp_amyl_cat_sub.
DR InterPro; IPR006048; Alpha_amyl_C.
DR InterPro; IPR006047; Alpha_amyl_cat.
DR InterPro; IPR006046; Glyco_hydro_13.
DR Pfam; PF00128; alpha-amylase; 1.
DR Pfam; PF02806; alpha-amylase_C; 1.
DR PRINTS; PR00110; ALPHAAMYLASE.
DR SMART; SM00642; Aamy; 1.
DR SMART; SM00632; Aamy_C; 1.
KW Hydrolase; Glycosidase; Carbohydrate metabolism; Calcium; Signal.
FT NON_TER 1 1
FT SIGNAL <1 16 POTENTIAL.
FT CHAIN 17 489 ALPHA-AMYLASE.
FT ACT_SITE 203 203 BY SIMILARITY.
FT ACT_SITE 207 207 BY SIMILARITY.
FT ACT_SITE 305 305 BY SIMILARITY.
FT DISULFID 44 102 BY SIMILARITY.
FT DISULFID 152 166 BY SIMILARITY.
FT DISULFID 372 378 BY SIMILARITY.
FT DISULFID 443 455 BY SIMILARITY.
SQ SEQUENCE 489 AA; 53247 MW; D1AB107C48FF8721 CRC64;



Query Match 58.2%; Score 39; DB 1; Length 489;
Best Local Similarity 83.3%; Pred. No. 49;
Matches 5; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy     2 ENWWGD 7
         |||| |
Db   401 ENWWSD 406



RESULT 9 
LCYB_TOBAC 

ID LCYB_TOBAC STANDARD; PRT; 500 AA. 

AC Q43578; 

DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Lycopene beta cyclase, chloroplast precursor (EC 1.14.-.-).
GN LCY1 OR CRTL-1.
OS Nicotiana tabacum (Common tobacco).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
OC Asteridae; lamiids; Solanales; Solanaceae; Nicotiana.
OX NCBI_TaxID=4097;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Samsun NN; TISSUE=Leaf;
RX MEDLINE=96434545; PubMed=8837512;
RA Cunningham F.X. Jr., Pogson B., Sun Z., McDonald K.A., Dellapenna D.,
RA Gantt E.;
RT "Functional analysis of the beta and epsilon lycopene cyclase enzymes
RT of Arabidopsis reveals a mechanism for control of cyclic carotenoid
RT formation.";
RL Plant Cell 8:1613-1626(1996).
CC -!- FUNCTION: CATALYZES THE DOUBLE CYCLIZATION REACTION WHICH CONVERTS
CC LYCOPENE TO BETA-CAROTENE AND NEUROSPORENE TO BETA-ZEACAROTENE.
CC -!- PATHWAY: Carotenoid biosynthesis.

CC -!- SUBCELLULAR LOCATION: Chloroplast. 

CC -!- SIMILARITY: BELONGS TO THE LYCOPENE CYCLASE FAMILY. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X81787; CAA57386.1; 

DR PIR; S72506; S72506. 

DR InterPro; IPR000205; NAD_binding. 

KW Oxidoreductase; NAD; Carotenoid biosynthesis; Chloroplast; 

KW Transit peptide. 

FT TRANSIT 1 81 CHLOROPLAST (POTENTIAL).
FT CHAIN 82 500 LYCOPENE BETA CYCLASE.
FT NP_BIND 86 114 NAD (POTENTIAL).
SQ SEQUENCE 500 AA; 56067 MW; 2E3721B87EE6CBC8 CRC64;

Query Match 58.2%; Score 39; DB 1; Length 500;
Best Local Similarity 66.7%; Pred. No. 50;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy     1 CENWWGDVC 9
         ||||   ||
Db    41 CENWGKGVC 49



RESULT 10 
ENV_HV1EL

ID ENV_HV1EL STANDARD; PRT; 853 AA.
AC P04581;
DT 13-AUG-1987 (Rel. 05, Created)
DT 13-AUG-1987 (Rel. 05, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Envelope polyprotein GP160 precursor [Contains: Exterior membrane
DE glycoprotein (GP120); Transmembrane glycoprotein (GP41)].
GN ENV.
OS Human immunodeficiency virus type 1 (ELI isolate) (HIV-1).
OC Viruses; Retroid viruses; Retroviridae; Lentivirus.
OX NCBI_TaxID=11689;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=86245056; PubMed=2424612;
RA Alizon M., Wain-Hobson S., Montagnier L., Sonigo P.;
RT "Genetic variability of the AIDS virus: nucleotide sequence analysis
RT of two isolates from African patients.";
RL Cell 46:63-74(1986).
CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; K03454; AAA44329.1; -.
DR EMBL; A07108; CAA00616.1; -.
DR HIV; K03454; ENV$ELI.
DR InterPro; IPR000328; Env_GP41.
DR InterPro; IPR000777; GP120.
DR Pfam; PF00516; GP120; 1.
DR Pfam; PF00517; GP41; 1.
KW AIDS; Coat protein; Polyprotein; Glycoprotein; Transmembrane;
KW Signal.
FT SIGNAL 1 31 BY SIMILARITY.
FT CHAIN 32 508 EXTERIOR MEMBRANE GLYCOPROTEIN
FT CHAIN 509 853 TRANSMEMBRANE GLYCOPROTEIN.
FT DISULFID 53 73 BY SIMILARITY.
FT DISULFID 118 206 BY SIMILARITY.
FT DISULFID 125 197 BY SIMILARITY.
FT DISULFID 130 154 BY SIMILARITY.
FT DISULFID 219 248 BY SIMILARITY.
FT DISULFID 229 240 BY SIMILARITY.
FT DISULFID 297 330 BY SIMILARITY.
FT DISULFID 376 442 BY SIMILARITY.
FT DISULFID 383 416 BY SIMILARITY.
FT CARBOHYD 87 87 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 129 129 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 137 137 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 143 143 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 153 153 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 157 157 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 183 183 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 188 188 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 198 198 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 235 235 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 242 242 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 263 263 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 277 277 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 290 290 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 331 331 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 353 353 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 384 384 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 390 390 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 394 394 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 400 400 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 405 405 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 406 406 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 411 411 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 445 445 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 458 458 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 459 459 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 462 462 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 608 608 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 613 613 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 622 622 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 634 634 N-LINKED (GLCNAC...) (POTENTIAL)
SQ SEQUENCE 853 AA; 96721 MW; F9CD864DAA0D07A5 CRC64;


Query Match 58.2%; Score 39; DB 1; Length 853;
Best Local Similarity 80.0%; Pred. No. 83;
Matches 4; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy     1 CENWW 5
         |:|||
Db    10 CQNWW 14



RESULT 11 
ADP1_YEAST 

ID ADP1_YEAST STANDARD; PRT; 1049 AA.
AC P25371;
DT 01-MAY-1992 (Rel. 22, Created)
DT 01-MAY-1992 (Rel. 22, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Probable ATP-dependent permease precursor.
GN ADP1 OR YCR011C OR YCR11C OR YCR105.
OS Saccharomyces cerevisiae (Baker's yeast).
OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
OC Saccharomycetales; Saccharomycetaceae; Saccharomyces.
OX NCBI_TaxID=4932;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=92160395; PubMed=1789009;
RA Purnelle B., Skala J., Goffeau A.;
RT "The product of the YCR105 gene located on the chromosome III from
RT Saccharomyces cerevisiae presents homologies to ATP-dependent
RT permeases.";
RL Yeast 7:867-872(1991).
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=92327849; PubMed=1626432;
RA Skala J., Purnelle B., Goffeau A.;

RT "The complete sequence of a 10.8 kb segment distal of SUF2 on the 

RT right arm of chromosome III from Saccharomyces cerevisiae reveals 

RT seven open reading frames including the RVS161, ADP1 and PGK genes."; 

RL Yeast 8:409-417(1992). 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential). 

CC -!- SIMILARITY: BELONGS TO THE ABC TRANSPORTER FAMILY. MDR SUBFAMILY. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X59720; CAA42328.1; -. 

DR PIR; S19421; S19421. 

DR SGD; S0000604; ADP1.
DR GO; GO:0005783; C:endoplasmic reticulum; IDA.
DR InterPro; IPR003593; AAA_ATPase.
DR InterPro; IPR003439; ABC_transporter.
DR Pfam; PF00005; ABC_tran; 1.
DR ProDom; PD000006; ABC_transporter; 1.
DR SMART; SM00382; AAA; 1.
DR PROSITE; PS00211; ABC_TRANSPORTER_1; 1.
DR PROSITE; PS50893; ABC_TRANSPORTER_2; 1.
KW ATP-binding; Transmembrane; Glycoprotein; Transport; Signal.
FT SIGNAL 1 25 POTENTIAL.
FT CHAIN 26 1049 PROBABLE ATP-DEPENDENT
FT NP_BIND 423 430 ATP (BY SIMILARITY).
FT TRANSMEM 325 345 POTENTIAL.
FT TRANSMEM 464 481 POTENTIAL.
FT TRANSMEM 794 814 POTENTIAL.
FT TRANSMEM 829 849 POTENTIAL.
FT TRANSMEM 878 898 POTENTIAL.
FT TRANSMEM 910 930 POTENTIAL.
FT TRANSMEM 938 958 POTENTIAL.
FT TRANSMEM 1001 1021 POTENTIAL.
FT TRANSMEM 1025 1045 POTENTIAL.
FT CARBOHYD 50 50 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 114 114 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 165 165 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 221 221 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 815 815 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 935 935 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 960 960 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 971 971 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1049 AA; 117231 MW; ABC9CE54BCFDF6A3 CRC64;

Query Match 58.2%; Score 39; DB 1; Length 1049; 

Best Local Similarity 53.8%; Pred. No. 1e+02;
Matches 7; Conservative 1; Mismatches 1; Indels 4; Gaps 1;

Qy     1 CENWWG----DVC 9
         |:| ||    |||
Db   119 CDNGWGGINCDVC 131

RESULT 12 
EX5C_HAEIN 

ID EX5C_HAEIN STANDARD; PRT; 1121 AA. 

AC P44945; 

DT 01-NOV-1995 (Rel. 32, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Exodeoxyribonuclease V gamma chain (EC 3.1.11.5).
GN RECC OR HI0942.
OS Haemophilus influenzae.
OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales;
OC Pasteurellaceae; Haemophilus.
OX NCBI_TaxID=727;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Rd / KW20 / ATCC 51907;
RX MEDLINE=95350630; PubMed=7542800;
RA Fleischmann R.D., Adams M.D., White O., Clayton R.A., Kirkness E.F.,
RA Kerlavage A.R., Bult C.J., Tomb J.-F., Dougherty B.A., Merrick J.M.,
RA McKenney K., Sutton G., Fitzhugh W., Fields C.A., Gocayne J.D.,
RA Scott J.D., Shirley R., Liu L.-I., Glodek A., Kelley J.M.,
RA Weidman J.F., Phillips C.A., Spriggs T., Hedblom E., Cotton M.D.,
RA Utterback T.R., Hanna M.C., Nguyen D.T., Saudek D.M., Brandon R.C.,
RA Fine L.D., Fritchman J.L., Fuhrmann J.L., Geoghagen N.S.M.,
RA Gnehm C.L., McDonald L.A., Small K.V., Fraser C.M., Smith H.O.,
RA Venter J.C.;
RT "Whole-genome random sequencing and assembly of Haemophilus influenzae
RT Rd.";
RL Science 269:496-512(1995).
CC -!- FUNCTION: EXHIBITS A WIDE VARIETY OF CATALYTIC ACTIVITIES
CC INCLUDING ATP-DEPENDENT EXONUCLEASE, ATP-STIMULATED ENDONUCLEASE,
CC ATP-DEPENDENT HELICASE AND DNA-DEPENDENT ATPASE ACTIVITIES
CC (BY SIMILARITY).
CC -!- CATALYTIC ACTIVITY: Exonucleolytic cleavage (in the presence of
CC ATP) in either 5'- to 3'- or 3'- to 5'-direction to yield 5'-
CC phosphooligonucleotides.
CC -!- SUBUNIT: CONSIST OF THREE SUBUNITS; RECB, RECC AND RECD
CC (BY SIMILARITY).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U32775; AAC22596.1; 

DR PIR; G64103; G64103. 

DR TIGR; HI0942;
DR InterPro; IPR006347; ExoDNase_Vg.
DR Pfam; PF04257; Exonuc_V_gamma; 1.
DR TIGRFAMs; TIGR01450; recC; 1.
KW Hydrolase; Nuclease; Exonuclease; Endonuclease; Helicase; DNA repair;
KW Complete proteome.
SQ SEQUENCE 1121 AA; 129668 MW; E5070957296AE0D3 CRC64;

Query Match 58.2%; Score 39; DB 1; Length 1121;
Best Local Similarity 50.0%; Pred. No. 1.1e+02;
Matches 4; Conservative 3; Mismatches 1; Indels 0; Gaps 0;

Qy     1 CENWWGDV 8
Db   263 CQEYWGDI 270



RESULT 13 
FOL1_BOVIN

ID FOL1_BOVIN STANDARD; PRT; 222 AA.
AC P02702;
DT 21-JUL-1986 (Rel. 01, Created)
DT 23-OCT-1986 (Rel. 02, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Milk folate-binding protein (FBP) (Folate receptor alpha).
OS Bos taurus (Bovine).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Bovidae; Bovinae; Bos.
OX NCBI_TaxID=9913;
RN [1]
RP SEQUENCE.
RC TISSUE=Milk;
RA Svendsen I., Hansen S.I., Holm J., Lyngbye J.;
RT "The complete amino acid sequence of the folate-binding protein from
RT cow's milk.";
RL Carlsberg Res. Commun. 49:123-131(1984).
RN [2]
RP SEQUENCE OF 1-62; 72-102 AND 192-222.
RC TISSUE=Milk;
RA Svendsen I., Martin B., Pedersen T.G., Hansen S.I., Holm J.,
RA Lyngbye J.;
RT "Isolation and characterization of the folate-binding protein from
RT cow's milk.";
RL Carlsberg Res. Commun. 44:89-99(1979).
CC -!- FUNCTION: BINDS TO FOLATE AND REDUCED FOLIC ACID DERIVATIVES AND
CC MEDIATES DELIVERY OF 5-METHYLTETRAHYDROFOLATE TO THE INTERIOR OF
CC CELLS.



CC -!- PTM: EIGHT DISULFIDE BONDS ARE PRESENT. 

CC -!- SIMILARITY: BELONGS TO THE FOLATE RECEPTOR FAMILY. 

DR PIR; A03161; BFBO.
DR InterPro; IPR004269; Folate_rec.
DR Pfam; PF03024; Folate_rec; 1.
KW Receptor; Glycoprotein; Milk; Folate-binding.
FT CARBOHYD 49 49 N-LINKED (GLCNAC...).
FT CARBOHYD 141 141 N-LINKED (GLCNAC...).
SQ SEQUENCE 222 AA; 25825 MW; 528C388E9A9C0484 CRC64;

Query Match 56.7%; Score 38; DB 1; Length 222;
Best Local Similarity 57.1%; Pred. No. 32;
Matches 4; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy     1 CENWWGD 7
Db   119 CQSWWED 125



RESULT 14 
CYA8_RAT

ID CYA8_RAT STANDARD; PRT; 1248 AA.
AC P40146;
DT 01-FEB-1995 (Rel. 31, Created)
DT 01-FEB-1995 (Rel. 31, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Adenylate cyclase, type VIII (EC 4.6.1.1) (ATP pyrophosphate-lyase)
DE (Ca(2+)/calmodulin activated adenylyl cyclase).
GN ADCY8.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Sprague-Dawley; TISSUE=Brain;
RX MEDLINE=94216337; PubMed=8163524;
RA Cali J.J., Zwaagstra J.C., Mons N., Cooper D.M., Krupinski J.;
RT "Type VIII adenylyl cyclase. A Ca2+/calmodulin-stimulated enzyme
RT expressed in discrete regions of rat brain.";
RL J. Biol. Chem. 269:12190-12195(1994).
CC -!- FUNCTION: This is a membrane-bound, calcium-inhibitable adenylyl
CC cyclase. May be involved in learning, in memory and in drug
CC dependence.
CC -!- CATALYTIC ACTIVITY: ATP = 3',5'-cyclic AMP + diphosphate.
CC -!- COFACTOR: Binds 2 magnesium ions per subunit (By similarity).
CC -!- ENZYME REGULATION: Activated by calcium/calmodulin.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: BRAIN. 

CC -!- DOMAIN: COMPOSED OF TWO HOMOLOGOUS DOMAINS. 

CC -!- SIMILARITY: Belongs to the adenylyl cyclase class-4/guanylyl 
CC cyclase family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



CC 










DR EMBL; L26986; AAA20504.1; -
DR PIR; A53588; A53588.
DR HSSP; P19754; 1AWK.
DR InterPro; IPR001054; G_cyclase.
DR Pfam; PF00211; guanylate_cyc; 2.
DR SMART; SM00044; CYCc; 2.
DR PROSITE; PS00452; GUANYLATE_CYCLASES_1; 2.
DR PROSITE; PS50125; GUANYLATE_CYCLASES_2; 2.
KW Lyase; cAMP biosynthesis; Transmembrane; Glycoprotein; Repeat;
KW Metal-binding; Magnesium.
FT DOMAIN 1 179 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 180 200 POTENTIAL.
FT TRANSMEM 209 229 POTENTIAL.
FT TRANSMEM 244 264 POTENTIAL.
FT TRANSMEM 271 291 POTENTIAL.
FT TRANSMEM 293 313 POTENTIAL.
FT TRANSMEM 318 338 POTENTIAL.
FT DOMAIN 339 712 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 713 733 POTENTIAL.
FT TRANSMEM 735 755 POTENTIAL.
FT TRANSMEM 784 804 POTENTIAL.
FT TRANSMEM 828 848 POTENTIAL.
FT TRANSMEM 858 878 POTENTIAL.
FT TRANSMEM 891 911 POTENTIAL.
FT DOMAIN 912 1248 CYTOPLASMIC (POTENTIAL).
FT METAL 416 416 MAGNESIUM 1 AND 2 (BY SIMILARITY).
FT METAL 417 417 MAGNESIUM 2 (VIA CARBONYL OXYGEN) (BY
FT SIMILARITY).
FT METAL 460 460 MAGNESIUM 1 AND 2 (BY SIMILARITY).
FT CARBOHYD 814 814 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 818 818 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 885 885 N-LINKED (GLCNAC...) (POTENTIAL)
SQ SEQUENCE 1248 AA; 139822 MW; 0171A3CEED034961 CRC64;


Query Match 56.7%; Score 38; DB 1; Length 1248;
Best Local Similarity 55.6%; Pred. No. 1.7e+02;
Matches 5; Conservative 2; Mismatches 2; Indels 0;

Qy     1 CENWWGDVC 9
Db  1050 CEDKWGHLC 1058





RESULT 15 
CYA8_MOUSE 

ID CYA8_MOUSE STANDARD; PRT; 1249 AA.
AC P97490;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Adenylate cyclase, type VIII (EC 4.6.1.1) (ATP pyrophosphate-lyase)
DE (Ca(2+)/calmodulin activated adenylyl cyclase).
GN ADCY8.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BALB/c; TISSUE=Brain;
RA Premont R.T.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: This is a membrane-bound, calcium-inhibitable adenylyl
CC cyclase. May be involved in learning, in memory and in drug
CC dependence.
CC -!- CATALYTIC ACTIVITY: ATP = 3',5'-cyclic AMP + diphosphate.
CC -!- COFACTOR: Binds 2 magnesium ions per subunit (By similarity).
CC -!- ENZYME REGULATION: Activated by calcium/calmodulin.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- DOMAIN: COMPOSED OF TWO HOMOLOGOUS DOMAINS. 

CC -!- SIMILARITY: Belongs to the adenylyl cyclase class-4/guanylyl 
CC cyclase family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U85021; AAB41885.1; 

DR HSSP; P19754; 1AWK. 



DR MGD; MGI:1341110; Adcy8.
DR InterPro; IPR001054; G_cyclase.
DR Pfam; PF00211; guanylate_cyc; 2.
DR SMART; SM00044; CYCc; 2.
DR PROSITE; PS00452; GUANYLATE_CYCLASES_1; 2.
DR PROSITE; PS50125; GUANYLATE_CYCLASES_2; 2.
KW Lyase; cAMP biosynthesis; Transmembrane; Glycoprotein; Repeat;
KW Metal-binding; Magnesium.
FT DOMAIN 1 180 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 181 201 POTENTIAL.
FT TRANSMEM 210 230 POTENTIAL.
FT TRANSMEM 245 265 POTENTIAL.
FT TRANSMEM 272 292 POTENTIAL.
FT TRANSMEM 294 314 POTENTIAL.
FT TRANSMEM 319 339 POTENTIAL.
FT DOMAIN 340 713 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 714 734 POTENTIAL.
FT TRANSMEM 736 756 POTENTIAL.
FT TRANSMEM 785 805 POTENTIAL.
FT TRANSMEM 829 849 POTENTIAL.
FT TRANSMEM 859 879 POTENTIAL.
FT TRANSMEM 892 912 POTENTIAL.
FT DOMAIN 913 1249 CYTOPLASMIC (POTENTIAL).
FT METAL 417 417 MAGNESIUM 1 AND 2 (BY SIMILARITY)
FT METAL 418 418 MAGNESIUM 2 (VIA CARBONYL OXYGEN) (BY
FT SIMILARITY)
FT METAL 461 461 MAGNESIUM 1 AND 2 (BY SIMILARITY)
FT CARBOHYD 815 815 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 819 819 N-LINKED (GLCNAC...) (POTENTIAL)
FT CARBOHYD 886 886 N-LINKED (GLCNAC...) (POTENTIAL)
SQ SEQUENCE 1249 AA; 140154 MW; B2FE5670E9A74DAF CRC64;



Query Match 56.7%; Score 38; DB 1; Length 1249;
Best Local Similarity 55.6%; Pred. No. 1.7e+02;
Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy     1 CENWWGDVC 9
Db  1051 CEDKWGHLC 1059



Search completed: November 13, 2003, 09:46:31 
Job time : 7.15625 secs

GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.



OM protein - protein search, using sw model 



Run on: November 13, 2003, 09:31:40 ; Search time 23.7188 Seconds 

(without alignments) 
97.917 Million cell updates/sec 

Title: US-09-228-866-2

Perfect score: 67 

Sequence: 1 CENWWGDVC 9 



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 830525 seqs, 258052604 residues 

Total number of hits satisfying chosen parameters: 830525 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : SPTREMBL_23:*

1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
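
The header values above can be cross-checked by hand. The following is a minimal sketch and not part of the GenCore output: it assumes that "Perfect score" is the query's self-alignment score under BLOSUM62 (the sum of the matrix diagonal over the query residues) and that "Query Match" is a hit's score expressed as a percentage of that perfect score. The function names are hypothetical; the diagonal values are those of the standard BLOSUM62 matrix.

```python
# Self-match (diagonal) scores of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}


def perfect_score(query: str) -> int:
    """Score of the query aligned to itself (assumed meaning of 'Perfect score')."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match_percent(score: int, query: str) -> float:
    """Hit score as a percentage of the perfect score (assumed meaning of 'Query Match')."""
    return round(100.0 * score / perfect_score(query), 1)


if __name__ == "__main__":
    query = "CENWWGDVC"  # query sequence listed in the header above
    print(perfect_score(query))            # compare with "Perfect score" in the header
    print(query_match_percent(47, query))  # compare with the Query Match of a hit scoring 47
```

Under these assumptions the sketch reproduces the perfect score of 67 reported for CENWWGDVC, and a hit scoring 47 comes out at the 70.1% query match listed for the top summary row.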

SUMMARIES 

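Result          Query
   No.  Score   Match  Length  DB  ID      Description
     1     47    70.1     252   5  Q9VT15  Q9vt15 drosophila
     2     44    65.7     267   5  Q21692  Q21692 caenorhabdi
     3     44    65.7    1132  16  Q8YLA5  Q8yla5 anabaena sp
     4     43    64.2     249   6  Q9XSH1  Q9xsh1 sus scrofa
     5     42    62.7    1008   4  Q8IVQ8  Q8ivq8 homo sapien
     6     41    61.2     240   3  O93877  O93877 fusarium ox
     7     41    61.2     242   3  Q04701  Q04701 fusarium so
     8     41    61.2     244   3  Q00851  Q00851 nectria hae
     9     41    61.2     333   6  Q9GKT2  Q9gkt2 macaca fasc
    10     41    61.2     333   6  Q9BGQ4  Q9bgq4 macaca fasc
    11     41    61.2     401   5  Q21938  Q21938 caenorhabdi
    12     41    61.2     425  17  Q8PVY9  Q8pvy9 methanosarc
    13     41    61.2     735  16  Q8PPY7  Q8ppy7 xanthomonas
    14     41    61.2     744  17  Q8TJY7  Q8tjy7 methanosarc
    15     40    59.7      85   4  O14597  O14597 homo sapien
    16     40    59.7     274  13  Q9PW81  Q9pw81 gallus gall
    17     40    59.7     337   2  Q93LL2  Q93ll2 nostoc line
    18     40    59.7     425  16  O25142  O25142 helicobacte
    19     40    59.7     430   5  Q8IMC1  Q8imc1 drosophila
    20     40    59.7     476  16  O25366  O25366 helicobacte
    21     40    59.7     478   2  O30511  O30511 helicobacte
    22     40    59.7     501   5  Q9GNS9  Q9gns9 trypanosoma
    23     40    59.7     572   5  Q9V4E7  Q9v4e7 drosophila
    24     40    59.7     582  16  Q9Z892  Q9z892 chlamydia p
    25     40    59.7     629  16  Q9JS50  Q9js50 chlamydia p
    26     40    59.7     685   4  Q8IWK5  Q8iwk5 homo sapien
    27     40    59.7     718   3  O94277  O94277 schizosacch
    28     40    59.7     767  10  Q8GSZ3  Q8gsz3 oryza sativ
    29     40    59.7     799  10  Q9ZTJ7  Q9ztj7 lycopersico
    30     40    59.7     826   2  Q8KL08  Q8kl08 rhizobium e
    31     40    59.7     861  10  Q9SLS3  Q9sls3 nicotiana t
    32     40    59.7     884  16  Q8P5B4  Q8p5b4 xanthomonas
    33     40    59.7     926   4  Q8TE49  Q8te49 homo sapien
    34     40    59.7     926  11  Q8R554  Q8r554 mus musculu
    35     40    59.7     944  10  Q9ZTJ9  Q9ztj9 lycopersico
    36     40    59.7     968  10  Q9ZTK1  Q9ztk1 lycopersico
    37     40    59.7    1016  10  Q9ZTJ6  Q9ztj6 lycopersico
    38     40    59.7    1112  10  Q41397  Q41397 lycopersico
    39     40    59.7    1112  10  Q41398  Q41398 lycopersico
    40     40    59.7    1148  16  Q9KPP4  Q9kpp4 vibrio chol
    41     40    59.7    1270  16  Q8EF44  Q8ef44 shewanella
    42     39    58.2      74  17  O29430  O29430 archaeoglob
    43     39    58.2     124   5  Q8MN86  Q8mn86 dictyosteli
    44     39    58.2     207  10  Q9LE73  Q9le73 arabidopsis
    45     39    58.2     208   8  Q9G9G1  Q9g9g1 lasaea sp.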
43 


64 


.2 


249 


6 


Q9XSH1 


Q9xshl sus scrofa 


5 


42 


62 


. 7 


1008 


4 


Q8IVQ8 


Q8ivq8 homo sapien 


6 


41 


61 


.2 


240 


3 


093877 


093877 fusarium ox 


7 


41 


61 


.2 


242 


3 


Q04701 


Q04701 fusarium so 


8 


41 


61 


. 2 


244 


3 


Q00851 


Q00851 nectria hae 


9 


41 


61 


. 2 


333 


6 


Q9GKT2 


Q9gkt2 macaca fasc 


10 


41 


61 


. 2 


333 


6 


Q9BGQ4 


Q9bgq4 macaca fasc 


11 


41 


61 


. 2 


401 


5 


Q21938 


Q2193 8 caenorhabdi 


12 


41 


61 


2 


425 


17 


Q8PVY9 


Q8pvy9 methanosarc 


13 


41 


61 


2 


735 


16 


Q8PPY7 


Q8ppy7 xanthomonas 


14 


41 


61 


2 


744 


17 


Q8TJY7 


Q8tjy7 methanosarc 


15 


40 


59 


7 


85 


4 


014597 


014597 homo sapien 


16 


40 


59 


7 


274 


13 


Q9PW81 


Q9pw81 gallus gall 


17 


40 


59 


7 


337 


2 


Q93LL2 


Q93112 nostoc line 


18 


40 


59 


7 


425 


16 


025142 


025142 helicobacte 


19 


40 


59 


7 


430 


5 


Q8IMC1 


Q8imcl drosophila 


20 


40 


59 


7 


476 


16 


025366 


025366 helicobacte 


21 


40 


59 


7 


478 


2 


030511 


030511 helicobacte 


22 


40 


59 


7 


501 


5 


Q9GNS9 


Q9gns9 trypanosoma 


23 


40 


59 


7 


572 


5 


Q9V4E7 


Q9v4e7 drosophila 


24 


40 


59 


7 


582 


16 


Q9Z892 


Q9z8 92 chlamydia p 


25 


40 


59 


7 


629 


16 


Q9JS50 


Q9js50 chlamydia p 


26 


40 


59. 


7 


685 


4 


Q8IWK5 


Q8iwk5 homo sapien 


27 


40 


59. 


7 


718 


3 


094277 


094277 schizosacch 


28 


40 


59, 


7 


767 


10 


Q8GSZ3 


Q8gsz3 oryza sativ 


29 


40 


59. 


7 


799 


10 


Q9ZTJ7 


Q9zt j 7 lycopersico 


30 


40 


59. 


7 


826 


2 


Q8KL08 


Q8kl0 8 rhizobium e 


31 


40 


59. 


7 


861 


10 


Q9SLS3 


Q9sls3 nicotiana t 


32 


40 


59. 


7 


884 


16 


Q8P5B4 


Q8p5b4 xanthomonas 


33 


40 


59. 


7 


926 


4 


Q8TE4 9 


Q8te4 9 homo sapien 


34 


40 


59. 


7 


926 


11 


Q8R554 


Q8r5 54 mus musculu 


35 


40 


59. 


7 


944 


10 


Q9ZTJ9 


Q9zt j 9 lycopersico 


36 


40 


59. 


7 


968 


10 


Q9ZTK1 


Q9ztkl lycopersico 


37 


40 


59. 


7 


1016 


10 


Q9ZTJ6 


Q9zt j 6 lycopersico 


38 


40 


59. 


7 


1112 


10 


Q41397 


Q4 13 97 lycopersico 


39 


40 


59. 


7 


1112 


10 


Q41398 


Q413 98 lycopersico 


40 


40 


59. 


7 


1148 


16 


Q9KPP4 


Q9kpp4 vibrio chol 


41 


40 


59. 


7 


1270 


16 


Q8EF44 


Q8ef44 shewanella 



42 


39 


58 


.2 


74 


17 


029430 


43 


39 


58 


.2 


124 


5 


Q8MN86 


44 


39 


58 


.2 


207 


10 


Q9LE73 


45 


39 


58 


.2 


208 


8 


Q9G9G1 



02943 0 archaeoglob 
Q8mn86 dictyosteli 

Q91e73 arabidopsis 
Q9g9gl lasaea sp. 



ALIGNMENTS 



RESULT 1 
Q9VT15 

ID Q9VT15 PRELIMINARY; PRT; 252 AA. 

AC Q9VT15; 

DT 01-MAY-2000 (TrEMBLrel . 13, Created) 

DT 01-MAY-2000 (TrEMBLrel . 13, Last sequence update) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE CG3088 protein (GH14734p).

GN CG3088.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1] 

RP SEQUENCE FROM N . A . 

RC STRAIN=BERKELEY ; 

RX MEDLINE=20196006; PubMed=10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong P., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler P., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster . " ; 

RL Science 287:2185-2195(2000). 

RN [2] 

RP SEQUENCE FROM N . A . 

RC STRAIN=Berkeley; 

RA Stapleton M., Brokstein P., Hong L., Agbayani A., Carlson J.,
RA Champe M., Chavez C., Dorsett V., Dresnek D., Farfan D., Frise E.,
RA George R., Gonzalez M., Guarin H., Kronmiller B., Li P., Liao G.,
RA Miranda A., Mungall C.J., Nunoo J., Pacleb J., Paragas V., Park S.,
RA Patel S., Phouanenavong S., Wan K., Yu C., Lewis S.E., Rubin G.M.,
RA Celniker S.;

RL Submitted (APR-2002) to the EMBL/GenBank/DDBJ databases. 

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY S1.

DR EMBL; AE003551; AAF50241.1; -. 

DR EMBL; AY094713; AAM11066.1; 

DR HSSP; P00766; 1GCT. 

DR FlyBase; FBgn0036015; CG3088. 

DR InterPro; IPR001314; Chymotrypsin.
DR InterPro; IPR001284; Ribosomal_L34E.
DR InterPro; IPR001254; Ser_protease_Try.
DR Pfam; PF00089; trypsin; 1.
DR PRINTS; PR00722; CHYMOTRYPSIN.
DR SMART; SM00020; Tryp_SPc; 1.
DR PROSITE; PS01145; RIBOSOMAL_L34E; 1.
DR PROSITE; PS50240; TRYPSIN_DOM; 1.
KW Hydrolase; Protease; Serine protease.
SQ SEQUENCE 252 AA; 26882 MW; 72F94448CF41DCEB CRC64;

Query Match 70.1%; Score 47; DB 5; Length 252; 

Best Local Similarity 75.0%; Pred. No. 5.9; 

Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 

Qy          2 ENWWGDVC 9
              |||| :||
Db        138 ENWWANVC 145
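
The statistics reported for this alignment can be reproduced with a short sketch. It is an illustration only and assumes that a non-identical pair is counted as "Conservative" when its BLOSUM62 score is positive and as a "Mismatch" otherwise, and that "Best Local Similarity" is the share of identical positions. Only the two off-diagonal BLOSUM62 entries needed for this pair of sequences are included, and `alignment_stats` is a hypothetical helper name.

```python
# Standard BLOSUM62 self-match scores for the residues that occur in this example.
DIAGONAL = {"C": 9, "E": 5, "N": 6, "W": 11, "G": 6, "D": 6, "V": 4, "A": 4}
# Off-diagonal BLOSUM62 entries needed for this example only (G/A = 0, D/N = 1).
OFF_DIAGONAL = {frozenset("GA"): 0, frozenset("DN"): 1}


def pair_score(a: str, b: str) -> int:
    """BLOSUM62 score for one aligned residue pair (restricted table)."""
    if a == b:
        return DIAGONAL[a]
    return OFF_DIAGONAL[frozenset((a, b))]


def alignment_stats(query: str, subject: str) -> dict:
    """Summarise an ungapped alignment of two equal-length sequences."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment needs equal-length sequences")
    matches = conservative = mismatches = score = 0
    for a, b in zip(query, subject):
        s = pair_score(a, b)
        score += s
        if a == b:
            matches += 1
        elif s > 0:          # assumed definition of a conservative substitution
            conservative += 1
        else:
            mismatches += 1
    return {
        "score": score,
        "matches": matches,
        "conservative": conservative,
        "mismatches": mismatches,
        "similarity": round(100.0 * matches / len(query), 1),
    }


if __name__ == "__main__":
    # Qy 2-9 and Db 138-145 from RESULT 1 (Q9VT15).
    print(alignment_stats("ENWWGDVC", "ENWWANVC"))
```

With these assumptions the sketch gives a score of 47, 6 matches, 1 conservative substitution, 1 mismatch and 75.0% similarity, in agreement with the values printed for RESULT 1.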



RESULT 2
Q21692
ID Q21692 PRELIMINARY; PRT; 267 AA.
AC Q21692;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical 31.4 kDa protein.
GN R04A9.3.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]



RP SEQUENCE FROM N.A.

RC STRAIN=Bristol N2;

RX MEDLINE=99069613; PubMed=9851916;

RA None;

RT "Genome sequence of the nematode C. elegans : a platform for 

RT investigating biology. The C. elegans Sequencing Consortium."; 

RL Science 282:2012-2018(1998). 

RN [2] 

RP SEQUENCE FROM N . A. 

RC STRAIN=Bristol N2 ; 

RA Geisel C. ; 

RT "The sequence of C. elegans cosmid R04A9 . " ; 

RL Submitted (DEC-1995) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N . A . 

RC STRAIN=Bristol N2 ; 

RA Waterston R. ; 

RT "Direct Submission."; 

RL Submitted (SEP-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; U41550; AAA83285.1; 

DR WormPep; R04A9.3; CE04790. 

KW Hypothetical protein. 

SQ SEQUENCE 267 AA; 31377 MW; B951DAF6D7EA3A0B CRC64;

Query Match 65.7%; Score 44; DB 5; Length 267;
Best Local Similarity 62.5%; Pred. No. 19;
Matches 5; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy          1 CENWWGDV 8
              |: ||||:
Db         19 CQAWWGDL 26



RESULT 3
Q8YLA5
ID Q8YLA5 PRELIMINARY; PRT; 1132 AA.
AC Q8YLA5;
DT 01-MAR-2002 (TrEMBLrel. 20, Created)
DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update)
DT 01-MAR-2002 (TrEMBLrel. 20, Last annotation update)
DE Hypothetical protein All7030.
GN ALL7030.
OS Anabaena sp. (strain PCC 7120).
OG Plasmid pCC7120alpha.
OC Bacteria; Cyanobacteria; Nostocales; Nostocaceae; Nostoc.
OX NCBI_TaxID=103690;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21595285; PubMed=11759840;
RA Kaneko T., Nakamura Y., Wolk C.P., Kuritz T., Sasamoto S.,
RA Watanabe A., Iriguchi M., Ishikawa A., Kawashima K., Kimura T.,
RA Kishida Y., Kohara M., Matsumoto M., Matsuno A., Muraki A.,
RA Nakazaki N., Shimpo S., Sugimoto M., Takazawa M., Yamada M.,
RA Yasuda M., Tabata S.;
RT "Complete genomic sequence of the filamentous nitrogen-fixing
RT cyanobacterium Anabaena sp. strain PCC 7120.";
RL DNA Res. 8:205-213(2001).
DR EMBL; AP003600; BAB78114.1;
KW Plasmid; Hypothetical protein; Complete proteome.
SQ SEQUENCE 1132 AA; 128203 MW; 236D2A33694B36F1 CRC64;



Query Match 65.7%; Score 44; DB 16; Length 1132;
Best Local Similarity 62.5%; Pred. No. 80;
Matches 5; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy          1 CENWWGDV 8

Db       1065 CDSWWGQV 1072



RESULT 4
Q9XSH1
ID Q9XSH1 PRELIMINARY; PRT; 249 AA.
AC Q9XSH1;
DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE Membrane-bound folate binding protein.
OS Sus scrofa (Pig).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Suina; Suidae; Sus.
OX NCBI_TaxID=9823;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Placenta;
RA Vallet J.L., Smith T.P.L., Sontegard T., Pearson P.L.,
RA Christenson R.K., Klemcke H.G.;
RT "Isolation of cDNAs encoding putative secreted and membrane-bound
RT folate binding proteins from endometrium of swine.";
RL Biol. Reprod. 0:0-0(1999).
DR EMBL; AF137374; AAD33741.1;
DR InterPro; IPR004269; Folate_rec.
DR Pfam; PF03024; Folate_rec; 1.
SQ SEQUENCE 249 AA; 28755 MW; 17FAAF2001D6B420 CRC64;

Query Match 64.2%; Score 43; DB 6; Length 249;
Best Local Similarity 71.4%; Pred. No. 25;
Matches 5; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy          1 CENWWGD 7
              |:||| |
Db        131 CQNWWED 137



RESULT 5
Q8IVQ8
ID Q8IVQ8 PRELIMINARY; PRT; 1008 AA.
AC Q8IVQ8;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Likely ortholog of mouse exportin 4 (Fragment).
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Eye; 

RA Strausberg R. ; 

RL Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC042504; AAH42504.1; -.
FT NON_TER 1 1
SQ SEQUENCE 1008 AA; 114035 MW; 9C94B2251B974494 CRC64;

Query Match 62.7%; Score 42; DB 4; Length 1008;
Best Local Similarity 100.0%; Pred. No. 1.5e+02;
Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 CENWW 5
              |||||
Db        580 CENWW 584



RESULT 6 
O93877
ID O93877 PRELIMINARY; PRT; 240 AA.
AC O93877;
DT 01-MAY-1999 (TrEMBLrel. 10, Created)
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Pectate lyase.
OS Fusarium oxysporum f. sp. lycopersici.
OC Eukaryota; Fungi; Ascomycota; Pezizomycotina; Sordariomycetes;
OC Hypocreales; mitosporic Hypocreales; Fusarium.
OX NCBI_TaxID=59765;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=42-87;
RX MEDLINE=99146871; PubMed=10022947;
RA Huertas-Gonzalez M.D., Ruiz-Roldan M.C., Garcia Maceira F.I.,
RA Roncero M.I., Di Pietro A.;
RT "Cloning and characterization of pl1 encoding an in planta-secreted
RT pectate lyase of Fusarium oxysporum.";
RL Curr. Genet. 35:36-40(1999).
DR EMBL; AF080485; AAC64368.1; -.
DR InterPro; IPR004898; Pect_lyase.
DR Pfam; PF03211; Pectate_lyase; 1.
KW Lyase.
SQ SEQUENCE 240 AA; 24859 MW; 46D4B297305006B1 CRC64;

Query Match 61.2%; Score 41; DB 3; Length 240;
Best Local Similarity 83.3%; Pred. No. 51;
Matches 5; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          4 WWGDVC 9
              || |||
Db        100 WWADVC 105



RESULT 7 



Q04701 



ID Q04701 PRELIMINARY; PRT; 242 AA. 

AC Q04701; 

DT 01-NOV-1996 (TrEMBLrel . 01, Created) 

DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update) 

DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update) 

DE Pectate lyase A precursor (EC 4.2.2.2). 

GN PELA. 

OS Fusarium solani (subsp. pisi) . 

OC Eukaryota; Fungi; Ascomycota; Pezizomycotina; Sordariomycetes ; 

OC Hypocreales; Nectriaceae; Nectria. 

OX NCBI_TaxID=70791;
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 18-55 AND 69-88.
RX MEDLINE=93015682; PubMed=1400187;
RA Gonzalez-Candelas L., Kolattukudy P.E.;
RT "Isolation and analysis of a novel inducible pectate lyase gene from
RT the phytopathogenic fungus Fusarium solani f. sp. pisi (Nectria
RT haematococca, mating population VI).";
RL J. Bacteriol. 174:6343-6349(1992).
RN [2]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 21-25.
RX MEDLINE=88022783; PubMed=3310898;
RA Crawford M.S., Kolattukudy P.E.;
RT "Pectate lyase from Fusarium solani f. sp. pisi: purification,
RT characterization, in vitro translation of the mRNA, and involvement in
RT pathogenicity.";
RL Arch. Biochem. Biophys. 258:196-205(1987).

CC -!- FUNCTION: INVOLVED IN MACERATION AND SOFT-ROTTING OF PLANT TISSUE. 

CC -!- CATALYTIC ACTIVITY: ELIMINATIVE CLEAVAGE OF PECTATE TO GIVE 
CC OLIGOSACCHARIDES WITH 4 -DEOXY-ALPHA-D-GLUC-4 -ENURONOSYL GROUPS AT 

CC THEIR NON-REDUCING ENDS. 

CC -!- ENZYME REGULATION: SUBJECT TO SELF CATABOLITE REPRESSION. 

CC -!- SUBCELLULAR LOCATION: EXTRACELLULAR. 

CC -!- INDUCTION: BY PECTIN. 

CC -!- DISEASE: PECTATE LYASES HAVE BEEN IMPLICATED AS PATHOGENICITY 

CC FACTORS WHICH INDUCE MACERATION OR ROTTING OF PLANT TISSUE. 

CC -!- SIMILARITY: BELONGS TO THE P LADES FAMILY OF EXTRACELLULAR PELS . 

CC SIMILAR TO THE PLBC PROTEINS. 

DR EMBL; M94691; AAA33338.1; 

DR EMBL; M94692; AAA33339.1; 

DR InterPro; IPR004898; Pect_lyase. 

DR Pfam; PF03211; Pectate__lyase ; 1. 

KW Lyase; Multigene family; Signal; Glycoprotein. 

FT SIGNAL 1 17 

FT CHAIN 18 242 PECTATE LYASE A. 

FT CARBOHYD 215 215 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 242 AA; 25339 MW; 3F338FBE895AB286 CRC64;

Query Match 61.2%; Score 41; DB 3; Length 242;
Best Local Similarity 83.3%; Pred. No. 52;
Matches 5; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          4 WWGDVC 9
              || |||
Db        102 WWADVC 107



RESULT 8 
Q00851 

ID Q00851 PRELIMINARY; PRT; 244 AA. 

AC Q00851; 

DT 01-NOV-1996 (TrEMBLrel . 01, Created) 

DT 01-NOV-1996 (TrEMBLrel . 01, Last sequence update) 

DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update) 

DE Pectate lyase B. 

GN PELB . 

OS Nectria haematococca . 

OC Eukaryota; Fungi; Ascomycota; Pezizomycotina; Sordariomycetes;

OC Hypocreales; Nectriaceae; Nectria.

OX NCBI_TaxID=140110;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=T8 ; 

RX MEDLINE=96099288; PubMed=8522511 ; 

RA Guo W., Gonzalez-Candelas L. , Kolattukudy P.E.; 

RT "Cloning of a novel constitutively expressed pectate lyase gene pelB
RT from Fusarium solani f. sp. pisi (Nectria haematococca, mating type
RT VI) and characterization of the gene product expressed in Pichia
RT pastoris.";
RL J. Bacteriol. 177:7070-7077(1995).
DR EMBL; U13051; AAA87383.1; -.
DR InterPro; IPR004898; Pect_lyase.
DR Pfam; PF03211; Pectate_lyase; 1.
KW Lyase.
SQ SEQUENCE 244 AA; 25663 MW; BF804413A6546469 CRC64;

Query Match 61.2%; Score 41; DB 3; Length 244;
Best Local Similarity 83.3%; Pred. No. 52;
Matches 5; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          4 WWGDVC 9
              || |||
Db        103 WWADVC 108

RESULT 9 
Q9GKT2 

ID Q9GKT2 PRELIMINARY; PRT; 333 AA. 

AC Q9GKT2 ; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update) 

DE Hypothetical 38.9 kDa protein. 

OS Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae; 

OC Cercopithecinae; Macaca. 

OX NCBI__TaxID=9541; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Cerebellum; 

RA Osada N., Hida M., Kusuda J., Tanuma R., Iseki K., Hirai M., Terao K.,
RA Suzuki Y., Sugano S., Hashimoto K.;
RT "Isolation of full-length cDNA clones from macaque brain cDNA
RT libraries.";
RL Submitted (DEC-2000) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AB052147; BAB19002.1; -.
DR InterPro; IPR006087; Sterol_desat.
DR InterPro; IPR006088; Sterol_desatur.
DR Pfam; PF01598; Sterol_desat; 1.
KW Hypothetical protein.
SQ SEQUENCE 333 AA; 38885 MW; B99755D843312A06 CRC64;

Query Match 61.2%; Score 41; DB 6; Length 333;
Best Local Similarity 83.3%; Pred. No. 71;
Matches 5; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          4 WWGDVC 9
              |||| |
Db        156 WWGDPC 161



RESULT 10
Q9BGQ4
ID Q9BGQ4 PRELIMINARY; PRT; 333 AA.
AC Q9BGQ4;
DT 01-JUN-2001 (TrEMBLrel. 17, Created)
DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Hypothetical 38.9 kDa protein.
OS Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae;
OC Cercopithecinae; Macaca.
OX NCBI_TaxID=9541;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Frontal cortex;
RA Osada N., Hida M., Kusuda J., Tanuma R., Iseki K., Hirai M., Terao K.,
RA Suzuki Y., Sugano S., Hashimoto K.;
RT "Isolation of full-length cDNA clones from macaque brain cDNA
RT libraries.";
RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AB056418; BAB33076.1;
DR InterPro; IPR006087; Sterol_desat.
DR InterPro; IPR006088; Sterol_desatur.
DR Pfam; PF01598; Sterol_desat; 1.
KW Hypothetical protein.
SQ SEQUENCE 333 AA; 38925 MW; 30795B28433138B1 CRC64;

Query Match 61.2%; Score 41; DB 6; Length 333;
Best Local Similarity 83.3%; Pred. No. 71;
Matches 5; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          4 WWGDVC 9
              |||| |
Db        156 WWGDPC 161


RESULT 11
Q21938
ID Q21938 PRELIMINARY; PRT; 401 AA.
AC Q21938;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE R11D1.10 protein.
GN R11D1.10.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RA Steward C.A.;
RL Submitted (JUN-1996) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=99069613; PubMed=9851916;
RA none;
RT "Genome sequence of the nematode C. elegans: A platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
DR EMBL; Z75547; CAA99905.3; -.
DR WormPep; R11D1.10; CE28948.
DR InterPro; IPR000306; Znf_FYVE.
DR SMART; SM00064; FYVE; 1.
DR PROSITE; PS50178; ZF_FYVE; 1.
SQ SEQUENCE 401 AA; 46304 MW; FE45918C5DB1E755 CRC64;

Query Match 61.2%; Score 41; DB 5; Length 401;
Best Local Similarity 83.3%; Pred. No. 86;
Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy          2 ENWWGD 7
              :|||||
Db         91 KNWWGD 96



RESULT 12
Q8PVY9
ID Q8PVY9 PRELIMINARY; PRT; 425 AA.
AC Q8PVY9;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Conserved protein.
GN MM1816.
OS Methanosarcina mazei (Methanosarcina frisia).
OC Archaea; Euryarchaeota; Methanococci; Methanosarcinales;
OC Methanosarcinaceae; Methanosarcina.
OX NCBI_TaxID=2209;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Goe1 / Go1 / ATCC BAA-199 / DSM 3647 / OCM 88;
RX MEDLINE=22120827; PubMed=12125824;
RA Deppenmeier U., Johann A., Hartsch T., Merkl R., Schmitz R.



RA Martinez-Arias R., Henne A., Wiezer A., Baeumer S., Jacobi C.,
RA Brueggemann H., Lienard T., Christmann A., Boemecke M., Steckel S.,
RA Bhattacharyya A., Lykidis A., Overbeek R., Klenk H.-P., Gunsalus R.P.,
RA Fritz H.-J., Gottschalk G.;
RT "The genome of Methanosarcina mazei: evidence for lateral gene
RT transfer between Bacteria and Archaea.";
RL J. Mol. Microbiol. Biotechnol. 4:453-461(2002).
DR EMBL; AE013418; AAM31512.1;
DR InterPro; IPR006457; S_layer_rel_Mac.
DR TIGRFAMs; TIGR01567; S_layer_rel_Mac; 1.
KW Complete proteome.
SQ SEQUENCE 425 AA; 46508 MW; 4826E1542DF69D97 CRC64;

Query Match 61.2%; Score 41; DB 17; Length 425;
Best Local Similarity 83.3%; Pred. No. 91;
Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy          2 ENWWGD 7
              |||||:
Db         71 ENWWGE 76

RESULT 13 
Q8PPY7 

ID Q8PPY7 PRELIMINARY; PRT; 735 AA. 

AC Q8PPY7; 

DT 01-OCT-2002 (TrEMBLrel . 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update) 

DE Hypothetical protein XAC0546. 

GN XAC0546. 

OS Xanthomonas axonopodis (pv. citri) . 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Xanthomonadales ; 

OC Xanthomonadaceae; Xanthomonas. 

OX NCBI_TaxID=92829;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=306 / ATCC 13902 / XV 101; 

RX MEDLINE=22022145; PubMed=12024217 ; 

RA da Silva A.C.R., Ferro J.A., Reinach F.C., Farah C.S., Furlan L.R.,
RA Quaggio R.B., Monteiro-Vitorello C.B., Van Sluys M.A., Almeida N.F.,
RA Alves L.M.C., do Amaral A.M., Bertolini M.C., Camargo L.E.A.,
RA Camarotte G., Cannavan F., Cardozo J., Chambergo F., Ciapina L.P.,
RA Cicarelli R.M.B., Coutinho L.L., Cursino-Santos J.R., El-Dorry H.,
RA Faria J.B., Ferreira A.J.S., Ferreira R.C.C., Ferro M.I.T.,
RA Formighieri E.F., Franco M.C., Greggio C.C., Gruber A.,
RA Katsuyama A.M., Kishi L.T., Leite R.P., Lemos E.G.M., Lemos M.V.F.,
RA Locali E.C., Machado M.A., Madeira A.M.B.N., Martinez-Rossi N.M.,
RA Martins E.C., Meidanis J., Menck C.F.M., Miyaki C.Y., Moon D.H.,
RA Moreira L.M., Novo M.T.M., Okura V.K., Oliveira M.C., Oliveira V.R.,
RA Pereira H.A., Rossi A., Sena J.A.D., Silva C., de Souza R.F.,
RA Spinola L.A.F., Takita M.A., Tamura R.E., Teixeira E.C., Tezza R.I.D.,
RA Trindade dos Santos M., Truffi D., Tsai S.M., White F.F.,
RA Setubal J.C., Kitajima J.P.;

RT "Comparison of the genomes of two Xanthomonas pathogens with differing 

RT host specificities."; 

RL Nature 417:459-463(2002). 



DR EMBL; AE011681; AAM35435.1; -.
KW Hypothetical protein; Complete proteome.
SQ SEQUENCE 735 AA; 80474 MW; 8D5E23E1DF991D69 CRC64;

Query Match 61.2%; Score 41; DB 16; Length 735;
Best Local Similarity 83.3%; Pred. No. 1.6e+02;
Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy          3 NWWGDV 8
              |||||:
Db        593 NWWGDL 598



RESULT 14
Q8TJY7
ID Q8TJY7 PRELIMINARY; PRT; 744 AA.
AC Q8TJY7;
DT 01-JUN-2002 (TrEMBLrel. 21, Created)
DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Predicted protein.
GN MA3639.
OS Methanosarcina acetivorans.
OC Archaea; Euryarchaeota; Methanococci; Methanosarcinales;
OC Methanosarcinaceae; Methanosarcina.
OX NCBI_TaxID=2214;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C2A / ATCC 35395 / DSM 2834;
RX MEDLINE=21929760; PubMed=11932238;
RA Galagan J.E., Nusbaum C., Roy A., Endrizzi M.G., Macdonald P.,
RA FitzHugh W., Calvo S., Engels R., Smirnov S., Atnoor D., Brown A.,
RA Allen N., Naylor J., Stange-Thomann N., DeArellano K., Johnson R.,
RA Linton L., McEwan P., McKernan K., Talamas J., Tirrell A., Ye W.,
RA Zimmer A., Barber R.D., Cann I., Graham D.E., Grahame D.A., Guss A.M.,
RA Hedderich R., Ingram-Smith C., Kuettner H.C., Krzycki J.A.,
RA Leigh J.A., Li W., Liu J., Mukhopadhyay B., Reeve J.N., Smith K.,
RA Springer T.A., Umayam L.A., White O., White R.H., de Macario E.G.,
RA Ferry J.G., Jarrell K.F., Jing H., Macario A.J.L., Paulsen I.,
RA Pritchett M., Sowers K.R., Swanson R.V., Zinder S.H., Lander E.,
RA Metcalf W.W., Birren B.;
RT "The genome of Methanosarcina acetivorans reveals extensive metabolic
RT and physiological diversity.";
RL Genome Res. 12:532-542(2002).
DR EMBL; AE011072; AAM06994.1;
DR InterPro; IPR006457; S_layer_rel_Mac.
DR TIGRFAMs; TIGR01567; S_layer_rel_Mac; 2.
KW Complete proteome.
SQ SEQUENCE 744 AA; 83248 MW; 3E822964E71D5C04 CRC64;

Query Match 61.2%; Score 41; DB 17; Length 744;
Best Local Similarity 83.3%; Pred. No. 1.6e+02;
Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy          2 ENWWGD 7
              |||||:
Db        389 ENWWGE 394



RESULT 15
O14597
ID O14597 PRELIMINARY; PRT; 85 AA.
AC O14597;
DT 01-JAN-1998 (TrEMBLrel. 05, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Folate binding protein (Fragment).
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Salivary gland;
RA Verma R.S., Elwood P.C.;
RT "Identification and characterization of homologous cDNA to KB folate
RT receptor from human salivary gland.";
RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF000381; AAB81938.2;
DR InterPro; IPR004269; Folate_rec.
DR Pfam; PF03024; Folate_rec; 1.
FT NON_TER 1 1
FT NON_TER 85 85
SQ SEQUENCE 85 AA; 9995 MW; E39A8454EE396A76 CRC64;

Query Match 59.7%; Score 40; DB 4; Length 85;
Best Local Similarity 71.4%; Pred. No. 26;
Matches 5; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CENWWGD 7
              || || |
Db         36 CEQWWED 42



Search completed: November 13, 2003, 09:50:57 
Job time : 25.7188 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: November 13, 2003, 09:31:40 ; Search time 30.2812 Seconds

(without alignments)
47.176 Million cell updates/sec

Title: US-09-228-866-3

Perfect score: 49

Sequence: 1 CLSSRLDAC 9

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5



Searched: 1107863 seqs, 158726573 residues 

Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0%

Maximum Match 100% 
Listing first 45 summaries 
Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

ALIGNMENTS



RESULT 1 
AAW13412 

ID AAW13412 standard; Peptide; 9 AA. 
XX 

AC AAW13412; 
XX 

DT 15-JAN-1998 (first entry) 
XX 

DE Brain homing peptide. 
XX 

KW Brain homing peptide; in vivo panning; screening; phage display; 

KW drug delivery. 

XX 

OS Synthetic. 
XX 

PN WO9710507-A1. 
XX 

PD 20-MAR-1997. 
XX 

PF 10-SEP-1996; 96WO-US14600.
XX

PR 11-SEP-1995; 95US-0526710.

PR 11-SEP-1995; 95US-0526708.
XX 

PA (LJOL-) LA JOLLA CANCER RES FOUND. 
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 1997-202359/18. 
XX 

PT Obtaining compound that homes to selected organ or tissue - by in 

PT vivo panning method, specifically to identify brain, kidney, 

PT angiogenic vasculature or tumour tissue homing peptide(s)
XX 

PS Claim 10; Page 67; 75pp; English. 
XX 

CC This synthetic peptide is a claimed example of a brain-homing 

CC peptide that was identified using a novel method for obtaining 

CC molecules that home to a selected organ or tissue. This in vivo 

CC panning method typically involves administering a phage display 

CC library to a subject, and identifying expressed peptides which 

CC home to the desired organ or tissue, e.g. brain, kidney, angiogenic 

CC vascular tissue or tumour tissue. The isolated peptides (see 

CC AAW13412-52, AAW11181-86) can be used to target e.g. drugs, toxins or 

CC labels to the selected organ/tissue (claimed) or to identify and/or 

CC isolate target molecules (claimed) . The peptides can be directly 

CC identified in vivo, as compared to prior art in vitro screening 

CC methods, which require further examination to see if they maintain 

CC specificity in vivo. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 49; DB 18; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 CLSSRLDAC 9 



Db 1 CLSSRLDAC 9 



RESULT 2
AAB07389

ID AAB07389 standard; peptide; 9 AA.
XX
AC AAB07389;
XX
DT 17-OCT-2000 (first entry)
XX
DE Brain homing peptide # 3.
XX
KW Brain; homing peptide; organ targeting; tissue targeting; mouse; cyclic.
XX
OS Mus sp.
XX
FH Key Location/Qualifiers
FT Disulfide-bond 1..9
FT /note= "Can optionally form a cyclic peptide"
XX
PN US6068829-A.
XX
PD 30-MAY-2000.
XX
PF 23-JUN-1997; 97US-0862855.
XX
PR 11-SEP-1995; 95US-0526710.
PR 10-MAR-1997; 97US-0813273.
XX

PA (BURN-) BURNHAM INST . 
XX 

PI Pasqualini R, Ruoslahti E; 
XX 

DR WPI; 2000-410850/35. 
XX 

PT Identifying and recovering organ homing molecules or peptides by in 
PT vivo panning comprises administering a library of diverse peptides 
PT linked to a tag which facilitates recovery of these peptides 
XX 

PS Example 2; Column 17; 20pp; English. 
XX 

CC The present sequence is a mouse brain homing peptide. This sequence was 

CC identified by using in vivo panning to screen a library of potential 

CC organ homing molecules. The present sequence can be used to direct a 

CC moiety to the brain tissue, by linking the moiety to the present 

CC sequence. Examples of potential moieties are drugs, toxins or a 

CC detectable label. The present sequence contains a SRL amino acid motif. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 49; DB 21; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 CLSSRLDAC 9 
     ||||||||| 
Db 1 CLSSRLDAC 9 

RESULT 3 
AAE11795 

ID AAE11795 standard; peptide; 9 AA. 
XX 

AC AAE11795; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Phage peptide #3 targetted to brain. 
XX 

KW Enriched library fraction; brain; kidney; tumour; panning; diagnostic; 
KW molecular medicine; drug delivery; peptidomimetic; pharmaceutical. 
XX 

OS Bacteriophage. 
XX 

FH Key Location/Qualifiers 

FT Domain 4..6 

FT /label= SRL_motif 

XX 

PN US6296832-B1. 
XX 

PD 02-OCT-2001. 
XX 

PF 08-JAN-1999; 99US-0226985. 
XX 

PR 23-JUN-1997; 97US-0862855. 
PR 11-SEP-1995; 95US-0526710. 
PR 10-MAR-1997; 97US-0813273. 
XX 

PA (BURN-) BURNHAM INST. 
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2001-610691/70. 
XX 

PT Enriched library fraction comprising molecules recovered by in vivo 
PT panning that selectively home to a selected organ or tissue useful for 
PT treating disease or in diagnostic methods 
XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The invention relates to an enriched library fraction containing 

CC molecules that selectively home to a selected organ or tissue such as 

CC brain, kidney or tumour recovered by in vivo panning. The invention 

CC generally relates to the field of molecular medicine, drug delivery and 

CC to a method of in vivo panning for identifying a molecule that homes to a 

CC specific organ. The molecules, e.g., peptides, peptidomimetics, proteins 

CC and fragments of proteins contained in an enriched library fraction may 

CC be administered to a subject as part of a pharmaceutical composition to 

CC treat disease or in diagnostic methods. The present sequence is a 

CC peptide from bacteriophage targetted to brain. 

XX 

SQ Sequence 9 AA; 



Query Match 100.0%; Score 49; DB 22; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 
Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 CLSSRLDAC 9 
     ||||||||| 
Db 1 CLSSRLDAC 9 

RESULT 4 
AAU10706 

ID AAU10706 standard; peptide; 9 AA. 
XX 

AC AAU10706; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Brain homing peptide #3 useful for delivery of target molecules. 
XX 

KW Organ targeting; tissue targeting; cancer; tumour homing molecule; 

KW delivery of target molecule; brain homing peptide. 

XX 

OS Synthetic. 
XX 

PN US6306365-B1. 
XX 

PD 23-OCT-2001. 
XX 

PF 08-JAN-1999; 99US-0227906. 
XX 

PR 23-JUN-1997; 97US-0862855. 

PR 11-SEP-1995; 95US-0526710. 

PR 10-MAR-1997; 97US-0813273. 
XX 

PA (BURN-) BURNHAM INST. 
XX 

PI Ruoslahti E, Pasqualini R; 
XX 

DR WPI; 2002-040196/05. 
XX 

PT Recovering molecules that home to an organ or tissue, useful for 

PT identifying molecules that home to a specific organ or tissue, e.g. 

PT identifying a tumour homing molecule to identify the presence of cancer, 

PT by in vivo panning of a library - 

XX 

PS Example 2; Column 17; 21pp; English. 
XX 

CC The present invention relates to a method of recovering molecules that 

CC home to a selected organ or tissue. The method comprises administering 

CC to the subject the library of diverse molecules, collecting a sample of 

CC the selected organ or tissue (e.g. brain or kidney), and recovering from 

CC the sample several molecules that home to the selected organ or tissue. 

CC The method is useful for identifying molecules, particularly useful for 

CC screening large numbers of molecules (e.g. peptides), that home to a 

CC specific organ. The identified molecule is useful for e.g. raising an 

CC antibody specific for a target molecule, targeting a desired moiety 



CC (e.g. drug, toxin or detectable label) to the selected organ. 

CC Specifically, the method is useful for identifying the presence of cancer 

CC in a subject by linking an appropriate moiety to a tumour homing 

CC molecule. The present method provides a direct means for identifying 

CC molecules that specifically home to a selected organ and, therefore 

CC provides a significant advantage over previous methods, which require 

CC that a molecule identified using an in vitro screening method 

CC subsequently be examined to determine if it maintains its specificity in 

CC vivo. AAU10704-AAU10723 represent brain homing peptides described in 

CC the present invention. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 49; DB 23; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 CLSSRLDAC 9 
     ||||||||| 
Db 1 CLSSRLDAC 9 



RESULT 5 
ABU59532 

ID ABU59532 standard; Peptide; 9 AA. 
XX 

AC ABU59532; 
XX 

DT 22-APR-2003 (first entry) 
XX 

DE Brain receptor targeting peptide #4. 
XX 

KW Targeting ligand; bioactive agent; polymer matrix; cancer; cytostatic; 

KW cathepsin-D substrate; peptides; neuroreceptor; adrenal receptor; 

KW fibronectin; vitronectin; integrin; RGD motif; angiogenic endothelium; 

KW tumour; cationic cancer-targeting peptide. 

XX 

OS Synthetic. 
XX 

PN US2002041898-A1. 
XX 

PD 11-APR-2002. 
XX 

PF 25-JUL-2001; 2001US-0912609. 
XX 

PR 05-JAN-2000; 2000US-0478124. 
PR 31-OCT-2000; 2000US-0703474. 
XX 

PA (UNGE/) UNGER E C. 

PA (MATS/) MATSUNAGA T O. 

PA (RAMA/) RAMASWAMI V. 

PA (ROMA/) ROMANOWSKI M J. 

XX 

PI Unger EC, Matsunaga TO, Ramaswami V, Romanowski MJ; 
XX 

DR WPI; 2003-208921/20. 
XX 



PT Targeted delivery system comprising a bioactive agent homogeneously 

PT dispersed in a targeted matrix is especially useful in cancer therapy 
PT 
XX 

PS Claim 23; Page 37; 46pp; English. 
XX 

CC The invention relates to a composition comprising a bioactive agent 

CC homogeneously dispersed in a targeted matrix (polymer and targeting 

CC ligand). Also included are a targeted matrix for use as a delivery 

CC vehicle comprising a polymer associated with a targeting ligand, 

CC enhancing the bioavailability of an agent comprising administration 

CC of the composition and treating cancer comprising administration of the 

CC novel composition. The method is useful for targeted delivery of a drug, 

CC especially in cancer therapy. The targeting ligand may be a peptide. 

CC Examples of targeting peptides are disclosed including cathepsin-D 

CC substrate peptides, peptides targeting receptors in the brain and 

CC kidney, peptides recognising fibronectin- and vitronectin-binding 

CC integrins, peptides targeting the RGD (Arg-Gly-Asp)-motif in, e.g., 

CC antibodies, peptides targeting the angiogenic endothelium of solid 

CC tumours, tissue specific peptides (e.g. of lung, skin, pancreas, 

CC intestine, uterus, adrenal gland and retina), and cationic cancer- 

CC targeting peptides. The present sequence is a peptide targeting 

CC ligand disclosed in the invention. 

XX 

SQ Sequence 9 AA; 

Query Match 100.0%; Score 49; DB 24; Length 9; 

Best Local Similarity 100.0%; Pred. No. 9.3e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 CLSSRLDAC 9 
     ||||||||| 
Db 1 CLSSRLDAC 9 



RESULT 6 
AAB26819 

ID AAB26819 standard; peptide; 20 AA. 
XX 

AC AAB26819; 
XX 

DT 23-JAN-2001 (first entry) 
XX 

DE Peptidic membrane binding element. 
XX 

KW Organ perfusion; transplantation; storage; antiinflammatory; 
KW immunosuppressive; vasotropic; complement activation inhibitor; 
KW allograft rejection; ischaemia reperfusion injury. 
XX 

OS Synthetic. 
XX 

FH Key Location/Qualifiers 
FT Modified-site 1 

FT /note= "Optionally N-Myristoyl-Gly" 

FT Modified-site 20 

FT /note= "Optionally S-2-Thiopyridyl-Cys-NH2" 

XX 



PN WO200053007-A1. 
XX 

PD 14-SEP-2000. 
XX 

PF 08-MAR-2000; 2000WO-GB00834. 
XX 

PR 10-MAR-1999; 99GB-0005503. 
XX 

PA (ADPR-) ADPROTECH LTD. 
XX 

PI Smith RAG, Pratt JR, Sacks SH; 
XX 

DR WPI; 2000-601920/57. 
XX 

PT Preparation for perfusing organ prior to transplantation or storage 

PT comprises soluble derivative of a soluble polypeptide which comprises 

PT two heterologous membrane binding elements with low membrane affinity 
PT 
XX 

PS Example 2; Page 20; 47pp; English. 
XX 

CC The present invention relates to formulations and preparations for 

CC perfusing an organ prior to transplantation or storage. The preparation 

CC comprises a soluble derivative or a polypeptide, which has two or more 

CC heterologous membrane binding elements. The membrane binding elements are 

CC capable of interacting, independently and with thermodynamic additivity, 

CC with membrane components of the organ exposed to extracellular perfusion 

CC fluids, and a flush storage solution. The preparation exhibits 

CC antiinflammatory, immunosuppressive and vasotropic activity and works as 

CC a complement activation inhibitor and an inhibitor of cytotoxic T 

CC lymphocyte activity. The preparation is used for preparing an organ prior 

CC to transplantation or storage and for prevention, treatment or 

CC amelioration of a disease or disorder associated with inflammation, 

CC inappropriate complement activation or inappropriate activation of 

CC coagulant or thrombotic processes prior to, during or after 

CC transplantation or storage of an organ. The preparation is useful for 

CC treating hyperacute and acute allograft rejection of transplanted organs 

CC such as kidney, heart, liver or lungs, ischaemia-reperfusion injury in 

CC transplanted organs, xenograft rejection and corneal graft rejection. The 

CC present sequence represents a peptidic membrane binding element used in 

CC an example of the preparation of the invention. 

XX 

SQ Sequence 20 AA; 

Query Match 81.6%; Score 40; DB 21; Length 20; 

Best Local Similarity 100.0%; Pred. No. 4.1; 

Matches 8; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy  2 LSSRLDAC 9 
      |||||||| 
Db 13 LSSRLDAC 20 



RESULT 7 
AAW77731 

ID AAW77731 standard; Protein; 130 AA. 
XX 



AC AAW77731; 
XX 

DT 30-OCT-1998 (first entry) 
XX 

DE Exou protein. 
XX 

KW Staphylococcus aureus protein; immune response induction; eye infection; 

KW antibody production; T-cell immune response; gastrointestinal infection; 

KW respiratory infection; inhibitor; bacterial infection; cardiac infection; 

KW central nervous system; kidney infection; urinary tract infection; 

KW antimicrobial compound identification; broad spectrum antibiotic; 

KW therapy. 

XX 

OS Staphylococcus aureus. 
XX 

PN EP841394-A2. 
XX 

PD 13-MAY-1998. 
XX 

PF 24-SEP-1997; 97EP-0307485. 
XX 

PR 24-SEP-1996; 96US-0027032. 
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 

PA (SMIK ) SMITHKLINE BEECHAM PLC. 
XX 

PI Black MT, Burnham MKR, Hodgson JE, Knowles DJC; 

PI Lonetto MA, Nicholas RO, Pratt JM, Reichard RW, Rosenberg M; 

PI Ward JM; 

XX 

DR WPI; 1998-252940/23. 

DR N-PSDB; AAV53520. 
XX 

PT New nucleic acid sequences from Staphylococcus aureus WCHU29 - 

PT useful in vaccines and for treatment of bacterial infections of e.g. 

PT respiratory tract and central nervous system 

XX 

PS Claim 11; Page 357-358; 390pp; English. 
XX 

CC This sequence represents a Staphylococcus aureus protein, that based on 

CC homology with a Rhizobium meliloti protein, is an Exou protein, 

CC and is encoded by a DNA sequence of the invention. 

CC The DNA sequences were isolated from Staphylococcus aureus WCHU29 

CC (NCIMB 40771). Host cells containing the DNA sequences are used to 

CC produce polypeptides or fragments. The proteins are used in the treatment 

CC of disease, for inducing an immune response by administering them, to 

CC produce antibody and/or T-cell immune response. Antagonists of the 

CC proteins are used for the inhibition of bacterial polypeptides. 

CC Conditions which may be treated include bacterial infections, especially 

CC respiratory, cardiac, gastrointestinal, central nervous, eye, kidney, 

CC urinary tract, skin, bones and joints. The proteins can also be used to 

CC identify antimicrobial compounds which are broad spectrum antibiotics, 

CC especially useful in the treatment of H. pylori infection. 

XX 

SQ Sequence 130 AA; 

Query Match 77.6%; Score 38; DB 19; Length 130; 



Best Local Similarity 77.8%; Pred. No. 52; 

Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy   1 CLSSRLDAC 9 
       || || ||| 
Db 112 CLLSRCDAC 120 



RESULT 8 
ABG60523 

ID ABG60523 standard; Peptide; 9 AA. 
XX 

AC ABG60523; 
XX 

DT 30-JUL-2002 (first entry) 
XX 

DE Selective targeting peptide #198. 
XX 

KW Targeting peptide; cancer; arthritis; diabetes; inflammatory disease; 

KW atherosclerosis; autoimmune disease; bacterial infection; apoptosis; 

KW viral infection; cardiovascular disease; degenerative disease; ischaemia; 

KW inflammation; macular degeneration; antiinflammatory; antidiabetic; 

KW cardiovascular; immunomodulator; antibacterial; antiviral; cytostatic; 

KW gene therapy. 

XX 

OS Synthetic. 
XX 

PN WO200220769-A1. 
XX 

PD 14-MAR-2002. 
XX 

PF 07-SEP-2001; 2001WO-US27692. 
XX 

PR 08-SEP-2000; 2000US-231266P. 

PR 17-JAN-2001; 2001US-0765101. 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Arap W, Pasqualini R; 
XX 

DR WPI; 2002-415731/44. 
XX 

PT Targeting peptides identified by phage display, useful for targeting 

PT delivery to an organ or tissue, particularly for treating a disease, 

PT e.g. cancer, inflammatory or autoimmune diseases, infections or 

PT cardiovascular disease - 
XX 

PS Claim 22; Page 121; 317pp; English. 
XX 

CC The invention relates to an isolated peptide of 100 amino acids or less 

CC in size useful for targeting delivery to an organ or tissue, particularly 

CC for treating a disease, e.g. cancer, arthritis, diabetes, inflammatory 

CC disease, atherosclerosis, autoimmune disease, bacterial infection, viral 

CC infection, cardiovascular disease or degenerative disease. The peptide is 

CC also useful for inducing apoptosis, particularly to a subject with 

CC ischaemia, cancer, arthritis, diabetes, cardiovascular disease, 

CC inflammation or macular degeneration. Furthermore, the peptide is useful 



CC for diagnosing the diseases cited above. Targeting peptides of the 

CC invention can also be used to deliver an agent to a foetus, by attaching 

CC a peptide to the agent and administering the peptide to a pregnant 

CC subject. Sequences ABG60326-ABG60574 represent selective targeting 

CC peptides of the invention. 

XX 

SQ Sequence 9 AA; 

Query Match 75.5%; Score 37; DB 23; Length 9; 

Best Local Similarity 66.7%; Pred. No. 9.3e+05; 

Matches 6; Conservative 2; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CLSSRLDAC 9 
     ||:| :||| 
Db 1 CLASGMDAC 9 



RESULT 9 
ABG35135 

ID ABG35135 standard; Peptide; 9 AA. 
XX 

AC ABG35135; 
XX 

DT 15-JUL-2002 (first entry) 
XX 

DE Pancreatic islet targeting peptide #11. 
XX 

KW Targeting peptide; cancer; Hodgkin's disease; cytostatic; 

KW immunosuppressive; anti-inflammatory; antiarthritic; antiviral; 

KW antiatherosclerotic; antidiabetic; antibacterial; diabetes mellitus; 

KW inflammatory disease; arthritis; atherosclerosis; cancer; 

KW autoimmune disease; bacterial infection; viral infection. 

XX 

OS Unidentified. 
XX 

PN WO200220722-A2. 
XX 

PD 14-MAR-2002. 
XX 

PF 07-SEP-2001; 2001WO-US27702. 
XX 

PR 08-SEP-2000; 2000US-231266P. 
PR 17-JAN-2001; 2001US-0765101. 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Arap W, Pasqualini R; 
XX 

DR WPI; 2002-383050/41. 
XX 

PT Identifying targeting peptides useful for treating e.g. diabetes 

PT mellitus, inflammatory diseases, cancer, or autoimmune diseases, 

PT comprises exposing a sample to a phage display library and recovering 

PT phage bound to the sample - 

XX 

PS Claim 56; Page 288; 298pp; English. 
XX 



CC This invention relates to a novel method for identifying disease 

CC targeting peptides. The method comprises exposing a sample from an 

CC organ, tissue or cell type of interest, to a phage display library and 

CC recovering phage bound to the sample (the phage expresses targeting 

CC peptides). The peptides identified by the method of the invention may 

CC have cytostatic, immunosuppressive, anti-inflammatory, antiarthritic, 

CC antiatherosclerotic, antidiabetic, antibacterial and antiviral 

CC activities. The methods and composition are useful for identifying 

CC targeting peptides and one or more receptors for a targeting peptide. 

CC The targeting peptides are used for selective delivery of therapeutic 

CC agents, including gene therapy vectors and fusion proteins, to specific 

CC organs, tissues, or cell types in subject. The targeting peptide may 

CC also be used for treating diseases such as diabetes mellitus, 

CC inflammatory diseases, arthritis, atherosclerosis, cancer, autoimmune 

CC diseases, bacterial and viral infections and Hodgkin's disease. The 

CC present sequence represents a targeting peptide of the invention. 

XX 

SQ Sequence 9 AA; 

Query Match 75.5%; Score 37; DB 23; Length 9; 

Best Local Similarity 66.7%; Pred. No. 9.3e+05; 

Matches 6; Conservative 2; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 CLSSRLDAC 9 
     ||:| :||| 
Db 1 CLASGMDAC 9 



RESULT 10 
AAM83339 

ID AAM83339 standard; Protein; 62 AA. 
XX 

AC AAM83339; 
XX 

DT 07-NOV-2001 (first entry) 
XX 

DE Human immune/haematopoietic antigen SEQ ID NO:10932. 
XX 

KW Human; immune; haematopoietic; immune/haematopoietic antigen; cancer; 

KW cytostatic; gene therapy; vaccine; metastasis. 

XX 

OS Homo sapiens. 
XX 

PN WO200157182-A2. 
XX 

2000US-0236367 
2000US-0236368 
2000US-0236369 
2000US-0236370 
2000US-0236802 
2000US-0237037 
2000US-0237040 
2000US-0239935 
2000US-0246609 
2000US-0246610 
2000US-0246611 
2000US-0246613 
2000US-0249207 
2000US-0249208 
2000US-0249209 
2000US-0249210 
2000US-0249211 
2000US-0249212 
2000US-0249213 
2000US-0249214 
2000US-0249215 
2000US-0249216 
2000US-0249217 
2000US-0249218 
2000US-0249244 
2000US-0249245 
2000US-0249264 
2000US-0249265 
2000US-0249297 
2000US-0249299 



PR 17-NOV-2000; 2000US-0249300. 
PR 01-DEC-2000; 2000US-0250160. 
PR 01-DEC-2000; 2000US-0250391. 
PR 05-DEC-2000; 2000US-0251030. 
PR 05-DEC-2000; 2000US-0251988. 
PR 05-DEC-2000; 2000US-0256719. 
PR 06-DEC-2000; 2000US-0251479. 
PR 08-DEC-2000; 2000US-0251856. 
PR 08-DEC-2000; 2000US-0251868. 
PR 08-DEC-2000; 2000US-0251869. 
PR 08-DEC-2000; 2000US-0251989. 
PR 08-DEC-2000; 2000US-0251990. 
PR 11-DEC-2000; 2000US-0254097. 
PR 05-JAN-2001; 2001US-0259678. 
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Barash SC, Ruben SM; 
XX 

DR WPI; 2001-483426/52. 

DR N-PSDB; AAK56120. 
XX 

PT Nucleic acids encoding human immune/hematopoietic antigen polypeptides, 

PT useful for preventing, diagnosing and/or treating cancers and 

PT metastasis - 
XX 

PS Claim 11; SEQ ID NO 10932; 3071pp + Sequence Listing; English. 
XX 

CC AAK54951 to AAK64702 encode the human immune/haematopoietic antigen (I) 

CC amino acid sequences given in AAM82170 to AAM91921. (I) have cytostatic 

CC activity, and can be used in gene therapy and vaccine production. (I) 

CC proteins and polynucleotides may be used in the prevention, diagnosis and 

CC treatment of diseases associated with inappropriate (I) expression. For 

CC example, they may be used to treat disorders associated with decreased 

CC expression by rectifying mutations or deletions in a patient's genome 

CC that affect the activity of (I) by expressing inactive proteins or to 

CC supplement the patient's own production of (I). Additionally, (I) 

CC polynucleotides may be used to produce the secreted (I), by inserting 

CC the nucleic acids into a host cell and culturing the cell to express the 

CC protein. (I) proteins and polynucleotides may be used to prevent, 

CC diagnose and treat immune/haematopoietic-related diseases, especially 

CC cancers and cancer metastases of haematopoietic-derived cells. AAK64703 

CC to AAK87694 represent human immune/haematopoietic antigen genomic 

CC sequences from the present invention. AAK54942 to AAK54950 and AAM82169 

CC represent sequences used in the exemplification of the present invention. 

XX 

SQ Sequence 62 AA; 

Query Match 73.5%; Score 36; DB 22; Length 62; 

Best Local Similarity 100.0%; Pred. No. 56; 

Matches 7; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy  1 CLSSRLD 7 
      ||||||| 
Db 53 CLSSRLD 59 



RESULT 11 
AAU29476 

ID AAU29476 standard; Protein; 209 AA. 
XX 

AC AAU29476; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Human G protein-coupled receptor (GPCR) polypeptide #97. 
XX 

KW Human; G protein-coupled receptor; GPCR; mental disorder; schizophrenia; 

KW neurological disorder; metabolic disorder; cancer; rheumatoid arthritis; 

KW thyroid disorder; neurodegenerative disorder; cardiovascular disorder; 

KW renal failure; autoimmune disorder; hyperproliferative disorder; HIV; 

KW human immunodeficiency virus; viral infection; neuroprotective; 

KW immunostimulant; neuroleptic; nootropic; tranquiliser; antidepressant; 

KW anorectic; gene therapy. 
XX 

OS Homo sapiens. 
XX 

PN WO200168858-A2. 
XX 

PD 20-SEP-2001. 
XX 

PF 16-MAR-2001; 2001WO-US08456. 
XX 

PR 16-MAR-2000; 2000US-187783P. 
PR 16-MAR-2000; 2000US-189907P. 
PR 16-MAR-2000; 2000US-189917P. 
PR 16-MAR-2000; 2000US-189918P. 
PR 16-MAR-2000; 2000US-189960P. 
PR 29-MAR-2000; 2000US-192155P. 
PR 29-MAR-2000; 2000US-192234P. 
PR 29-MAR-2000; 2000US-192830P. 
PR 29-MAR-2000; 2000US-192916P. 
PR 29-MAR-2000; 2000US-192923P. 
PR 29-MAR-2000; 2000US-192933P. 
PR 29-MAR-2000; 2000US-192945P. 
XX 

PA (PHAA ) PHARMACIA & UPJOHN CO. 
XX 

PI Vogeli G; 
XX 

DR WPI; 2001-607458/69. 

DR N-PSDB; AAS46915. 
XX 

PT Nucleic acid encoding G protein-coupled receptors, useful for the 

PT prevention, diagnosis and treatment of mental disorders - 

XX 

PS Claim 31; Page 91; 274pp; English. 
XX 

CC Sequences AAU29380-AAU29509 represent human G protein-coupled receptor 

CC (GPCR) polypeptides of the invention. The proteins and the DNA sequences 

CC encoding them can be used to identify compounds which bind to GPCR 

CC polypeptides and in screening for compounds that modulate GPCR activity. 

CC By screening a human subject for the presence of mutations in GPCR DNA, a 

CC GPCR-related disorder or a genetic predisposition can be diagnosed. The 



CC sequences can also be used for treatment and prevention of mental 

CC disorders such as schizophrenia, neurological disorders such as manic 

CC depression, metabolic disorders such as obesity, cancer, rheumatoid 

CC arthritis, thyroid disorders such as myxoedema, neurodegenerative 

CC disorders such as Parkinson's disease, cardiovascular disorders such as 

CC atherosclerosis, renal failure, autoimmune disorders, hyperproliferative 

CC disorders such as psoriasis and viral infections such as those caused by 

CC HIV. 

XX 

SQ Sequence 209 AA; 

Query Match 73.5%; Score 36; DB 22; Length 209; 

Best Local Similarity 55.6%; Pred. No. 1.8e+02; 

Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy  1 CLSSRLDAC 9 
      |:::||| | 
Db 61 CMNNRLDPC 69 



RESULT 12 
ABG60764 

ID ABG60764 standard; Protein; 209 AA. 
XX 

AC ABG60764; 
XX 

DT 13-AUG-2002 (first entry) 
XX 

DE Novel G protein coupled receptor (nGPCR-x) #97. 
XX 

KW G protein coupled receptor; nGPCR-x; immune response; thyroid disorder; 
KW mental disorder; thyreotoxicosis; myxoedema; inflammatory condition; 
KW Crohn's disease; cell differentiation; homeostasis; rheumatoid arthritis; 
KW renal failure; autoimmune disorder; movement disorder; CNS disorder; 
KW viral infection; human immunodeficiency virus; HIV; metabolic disorder; 
KW cardiovascular disorder; diabetes; obesity; anorexia; cardiomyopathy; 
KW proliferative disease; cancer; psoriasis; lung cancer; hormonal disorder; 
KW sexual dysfunction. 
XX 

OS Homo sapiens. 
XX 

PN US2002058306-A1. 
XX 

PD 16-MAY-2002. 
XX 

PF 16-MAR-2001; 2001US-0811284. 
XX 

PR 16-MAR-2000; 2000US-189783P. 
PR 16-MAR-2000; 2000US-189907P. 
PR 16-MAR-2000; 2000US-189917P. 
PR 16-MAR-2000; 2000US-189918P. 
PR 16-MAR-2000; 2000US-189960P. 
PR 24-MAR-2000; 2000US-192155P. 
PR 27-MAR-2000; 2000US-192234P. 
PR 29-MAR-2000; 2000US-192830P. 
PR 29-MAR-2000; 2000US-192945P. 
PR 29-MAR-2000; 2000US-192916P. 
PR 29-MAR-2000; 2000US-192923P. 
PR 29-MAR-2000; 2000US-192830P. 
PR 29-MAR-2000; 2000US-192945P. 
PR 29-MAR-2000; 2000US-192830P. 
PR 29-MAR-2000; 2000US-192945P. 
PR 29-MAR-2000; 2000US-192830P. 
PR 29-MAR-2000; 2000US-192945P. 
XX 

PA (VOGE/) VOGELI G. 
XX 

PI Vogeli G; 
XX 

DR WPI; 2002-434856/46. 

DR N-PSDB; ABK81693. 
XX 

PT New isolated nucleic acid encoding a G protein coupled receptor for 

PT producing the receptor which can induce an immune response in a mammal 
PT 
XX 

PS Claim 27; Page 67; 216pp; English. 
XX 

CC The invention describes an isolated nucleic acid (I) comprising a 

CC sequence encoding a portion of a G protein coupled receptor (nGPCR-x). 

CC (I) is used to produce a recombinant nGPCR-x polypeptide. A polypeptide 

CC encoded by (I) is used to induce an immune response in a mammal. nGPCR-x 

CC is used to identify a compound that binds to it and/or modulates its 

CC activity. (I) is used to identify animal homologues of nGPCR-x. (I) can 

CC be used to diagnose a human subject as having a brain or genetic 

CC predisposition disorder, such as a mental disorder. (I) is used to 

CC screen for an nGPCR-x related disorder including thyroid disorders (e.g. 

CC thyreotoxicosis, myxoedema), renal failure, inflammatory conditions (e.g. 

CC Crohn's disease), diseases related to cell differentiation and 

CC homeostasis, rheumatoid arthritis, autoimmune disorders, movement 

CC disorders, CNS disorders, viral infections (e.g. Human immunodeficiency 

CC virus), metabolic and cardiovascular disorders (e.g. diabetes, obesity, 

CC anorexia, cardiomyopathies), proliferative diseases and cancers (e.g. 

CC psoriasis, lung cancer), hormonal disorders, sexual dysfunction and 

CC hereditary mental disorders in a human patient. A host cell comprising 

CC (I) is used to screen for a modulator of nGPCR-x activity. nGPCR-x is 

CC used to identify compounds that can treat mental disorders. The 

CC polypeptide encoded by (I) is used to purify a G protein from a sample. 

CC This is the amino acid sequence of a novel G protein coupled receptor 

CC (nGPCR-x) protein described in the invention. 

XX 

SQ Sequence 209 AA; 

Query Match 73.5%; Score 36; DB 23; Length 209; 

Best Local Similarity 55.6%; Pred. No. 1.8e+02; 

Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy  1 CLSSRLDAC 9 
      |:::||| | 
Db 61 CMNNRLDPC 69 

RESULT 13 
AAY01191 



ID AAY01191 standard; Protein; 53 AA. 
XX 

AC AAY01191; 
XX 

DT 18-MAY-1999 (first entry) 
XX 

DE Polypeptide fragment encoded by gene 14. 
XX 

KW Human; secreted protein; gene therapy; protein therapy; tissue; cancer; 

KW tumour; neurodegenerative disorder; leukaemia; autoimmune disease; AIDS; 

KW developmental abnormality; foetal deficiency; Alzheimer's disease; 

KW cognitive disorder; schizophrenia; immunological disorder; mood disorder; 

KW immune deficiency disease; respiratory disorder; arthritis; skeletal; 

KW haematopoietic disorder; neural; osteoporosis; metabolic disorders; 

KW cardiovascular; endocrine; gastrointestinal; asthma; diagnosis. 

XX 

OS Homo sapiens. 
XX 

PN WO9901020-A2. 
XX 

PD 14-JAN-1999. 
XX 

PF 30-JUN-1998; 98WO-US13608. 
XX 

PR 12-SEP-1997; 97US-0058663. 

PR 01-JUL-1997; 97US-0051381. 

PR 01-JUL-1997; 97US-0051480. 

PR 12-SEP-1997; 97US-0058598. 
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Carter KC, Endress GA, Feng P, Rosen CA, Ruben SM; 
XX 

DR WPI; 1999-105683/09. 

DR N-PSDB; AAX22124. 
XX 

PT New isolated human genes and the secreted polypeptides they encode - 

PT useful for diagnosis and treatment of e.g. cancers, neurological 

PT disorders, immune diseases, immune deficiency diseases or blood 

PT disorders 
XX 

PS Disclosure; Page 24; 179pp; English. 
XX 

CC The invention relates to nucleic acid sequences (AAX22111 to AAX22134) 

CC encoding human secreted proteins (AAY01135 to AAY01158). The secreted 

CC protein gene sequences are deposited with the ATCC under deposit number 

CC ATCC 209118. Host cells comprising recombinant vectors containing the 

CC nucleic acid sequences are used for the recombinant production of the 

CC secreted proteins. The polynucleotide and amino acid sequences are useful 

CC for preventing, treating or ameliorating medical 

CC conditions e.g. by protein or gene therapy. Pathological conditions can 

CC be also diagnosed by determining the amount of the new polypeptides in a 

CC sample or by determining the presence of mutations in the new 

CC polynucleotides. Specific uses are described for each of the 

CC polynucleotides, based on which tissues they are most highly expressed 

CC in, and include developing products for the diagnosis or treatment of 

CC cancer, tumours, developmental abnormalities and foetal deficiencies, 



CC autoimmune diseases, lymphomas, Alzheimer's and cognitive disorders, 

CC schizophrenia, immunological disorders, immune deficiency diseases 

CC (AIDS), mood disorders, respiratory disorders, arthritis, asthma, 

CC haematopoietic disorders, neural disorders, skeletal disorders, 

CC osteoporosis, metabolic disorders, cardiovascular disorders, endocrine 

CC disorders or gastrointestinal disorders. The polypeptides are also useful 

CC for identifying their binding partners. The present sequence represents a 

CC polypeptide fragment encoded by a gene of the invention (see descriptor 

CC line for gene number). 

XX 

SQ Sequence 53 AA; 

Query Match 71.4%; Score 35; DB 20; Length 53; 

Best Local Similarity 55.6%; Pred. No. 72; 

Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0; 

Qy  1 CLSSRLDAC 9 
      |:: ||| | 
Db 40 CIAGRLDIC 48 



RESULT 14 
AAU61079 

ID AAU61079 standard; Protein; 80 AA. 
XX 

AC AAU61079; 
XX 

DT 27-FEB-2002 (first entry) 
XX 

DE Propionibacterium acnes immunogenic protein #21975. 
XX 

KW SAPHO syndrome; synovitis; acne; pustulosis; hypertosis; osteomyelitis; 

KW uveitis; endophthalmitis; bone; joint; central nervous system; ELISA; 

KW inflammatory lesion; acne vulgaris; enzyme linked immunosorbent assay; 

KW dermatological; osteopathic; neuroprotectant. 
XX 

OS Propionibacterium acnes. 
XX 

PN WO200181581-A2. 
XX 

PD 01-NOV-2001. 
XX 

PF 20-APR-2001; 2001WO-US12865. 
XX 

PR 21-APR-2000; 2000US-199047P. 

PR 02-JUN-2000; 2000US-208841P. 

PR 07-JUL-2000; 2000US-216747P. 
XX 

PA (CORI-) CORIXA CORP. 
XX 

PI Skeiky YAW, Persing DH, Mitcham JL, Wang SS, Bhatia A; 

PI L'maisonneuve J, Zhang Y, Jen S, Carter D; 

XX 

DR WPI; 2001-616774/71. 

DR N-PSDB; AAS59613. 
XX 

PT Propionibacterium acnes polypeptides and nucleic acids useful for 



PT vaccinating against and diagnosing infections, especially useful for 

PT treating acne vulgaris - 

XX 

PS Example 1; SEQ ID No 22274; 1069pp; English. 
XX 

CC Sequences AAU39105-AAU68017 represent Propionibacterium acnes immunogenic

CC polypeptides. The proteins and their associated DNA sequences are used in 

CC the treatment, prevention and diagnosis of medical conditions caused by 

CC P. acnes. The disorders include SAPHO syndrome (synovitis, acne, 

CC pustulosis, hypertosis and osteomyelitis), uveitis and endophthalmitis. 

CC P. acnes is also involved in infections of bone, joints and the central 

CC nervous system, however it is particularly involved in the inflammatory 

CC lesions associated with acne vulgaris. A method for detecting the 

CC presence or absence of P. acnes in a patient comprises contacting a 

CC sample with a binding agent that binds to the proteins of the invention 

CC and determining the amount of bound protein in the sample. The 

CC polypeptides may be used as antigens in the production of antibodies 

CC specific for P. acnes proteins. These antibodies can be used to 

CC downregulate expression and activity of P. acnes polypeptides and 

CC therefore treat P. acnes infections. The antibodies may also be used as 

CC diagnostic agents for determining P. acnes presence, for example, by 

CC enzyme linked immunosorbent assay (ELISA).

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 80 AA; 

Query Match 71.4%; Score 35; DB 22; Length 80; 

Best Local Similarity 87.5%; Pred. No. 1.1e+02;

Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy  1 CLSSRLDA 8
      |||| |||
Db 52 CLSSHLDA 59



RESULT 15 
AAM06519 

ID AAM06519 standard; Protein; 87 AA. 
XX 

AC AAM06519; 
XX 

DT 05-OCT-2001 (first entry) 
XX 

DE Human foetal protein, SEQ ID NO: 250. 
XX 

KW Human; foetal protein; cytostatic; immunosuppressive; immunostimulant;

KW nootropic; neuroprotective; thrombolytic; osteopathic; antiinflammatory; 

KW gene therapy; antisense therapy; cancer; immune disorder; 

KW growth disorder; osteoporosis; thrombolytic disorder; 

KW nervous system disorder; inflammation.

XX 

OS Homo sapiens. 
XX 

PN WO200155339-A2. 
XX 



PD 02-AUG-2001.
XX

PF 25-JAN-2001; 2001WO-US02723.
XX

PR 25-JAN-2000; 2000US-0491404.

PR 15-SEP-2000; 2000US-0663870.

PR 06-NOV-2000; 2000US-0707351.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Yeung G, Ford JE, Boyle BJ, Arterburn MC, Drmanac RA, Tang YT; 

PI Liu C, Asundi V, Zhou P, Werhman T; 

XX 

DR WPI; 2001-465571/50. 

DR N-PSDB; AAH94194. 
XX 

PT Novel fetal proteins useful for the treatment and diagnosis of diseases 

PT associated with dysfunction of the protein e.g. cancers, immune 

PT disorders, growth disorders, thrombolytic disorders, nervous system 

PT disorders and inflammation - 

XX 

PS Claim 10; Page 271-272; 715pp; English.
XX 

CC The invention relates to novel foetal polypeptides encoded by 

CC polynucleotides comprising one of 477 sequences fully defined in the 

CC specification. The foetal polynucleotides and polypeptides are 

CC useful in the treatment and diagnosis of diseases such as cancers, 

CC immune disorders, growth disorders (e.g. osteoporosis), thrombolytic 

CC disorders, nervous system disorders and inflammation. The present 

CC sequence is a polypeptide encoded by a cDNA assembled using 

CC an expressed sequence tag (EST) found to be expressed in human 

CC foetal tissue cDNA libraries. 

XX 

SQ Sequence 87 AA; 

Query Match 71.4%; Score 35; DB 22; Length 87; 

Best Local Similarity 66.7%; Pred. No. 1.1e+02;

Matches 6; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy  1 CLSSRLDAC 9
      ||| ||::|
Db 76 CLSVRLNSC 84



Search completed: November 13, 2003, 09:45:23 
Job time : 31.2812 secs

GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         November 13, 2003, 09:45:35 ; Search time 18.6562 Seconds
                                              (without alignments)
                                              88.069 Million cell updates/sec

Title:          US-09-228-866-3
Perfect score:  49
Sequence:       1 CLSSRLDAC 9

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       666188 seqs, 182559486 residues

Total number of hits satisfying chosen parameters:  666188

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
Listing first 45 summaries 
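
The throughput figure in the header above can be cross-checked from the other header values. The sketch below assumes that one "cell update" is one query residue compared against one database residue, so the rate is query length times residues searched divided by search time; the function name is illustrative and not part of GenCore.

```python
def cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second (assumed definition)."""
    return query_len * db_residues / seconds


# Values copied from the header: 9-residue query, 182559486 residues, 18.6562 s.
rate = cell_updates_per_sec(9, 182_559_486, 18.6562)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the reported 88.069 Million cell updates/sec to three decimal places.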

Database : Published_Applications_AA:*
 1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
 2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
 3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
 4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
 5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
 6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
 7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
 8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
 9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
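
The "Perfect score" and "Query Match" figures reported for each hit can be reproduced from the scoring table. The sketch below assumes that the perfect score is the query aligned to itself under BLOSUM62 and that Query Match is the hit score as a percentage of it; only the diagonal entries needed for the query CLSSRLDAC are included, and the function names are illustrative.

```python
# BLOSUM62 diagonal (self-match) scores, restricted to residues in the query.
BLOSUM62_DIAGONAL = {"A": 4, "C": 9, "D": 6, "L": 4, "R": 5, "S": 4}


def perfect_score(query: str) -> int:
    """Score of the query aligned against itself with no gaps."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match_percent(score: int, query: str) -> float:
    """Hit score as a percentage of the perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score(query), 1)


print(perfect_score("CLSSRLDAC"))
print(query_match_percent(38, "CLSSRLDAC"))
```

With these assumptions the query scores 49 against itself, and a hit scoring 38 corresponds to a Query Match of 77.6%.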
ALIGNMENTS



RESULT 1 

US-09-939-980-491 

; Sequence 491, Application US/09939980
; Patent No. US20020082234A1
; GENERAL INFORMATION:
; APPLICANT: Black, Michael
; Burnham, Martin
; Hodgson, John
; Knowles, David
; Lonetto, Michael
; Nicholas, Richard
; Pratt, Julie
; Reichard, Richard
; Rosenberg, Martin
; Ward, Judith
; TITLE OF INVENTION: Novel Prokaryotic Polynucleotides,
; Polypeptides and Their Uses
; NUMBER OF SEQUENCES: 534
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: SmithKline Beecham Corporation
; STREET: 709 Swedeland Road
; CITY: King of Prussia
; STATE: PA
; COUNTRY: USA
; ZIP: 19406-0939
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/939,980
; FILING DATE: 27-Aug-2001
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/936,165
; FILING DATE: <Unknown>
; ATTORNEY/AGENT INFORMATION:
; NAME: Gimmi, Edward R
; REGISTRATION NUMBER: 38,891
; REFERENCE/DOCKET NUMBER: P50549
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-270-4478
; TELEFAX: 610-270-5090
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 491:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 130 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: Protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 491:

US-09-939-980-491 

Query Match 77.6%; Score 38; DB 9; Length 130; 

Best Local Similarity 77.8%; Pred. No. 13; 

Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy   1 CLSSRLDAC 9
       || || |||
Db 112 CLLSRCDAC 120



RESULT 2 

US-10-156-761-13501 

; Sequence 13501, Application US/10156761
; Publication No. US20030119018A1
; GENERAL INFORMATION:
; APPLICANT: OMURA, SATOSHI
; APPLICANT: IKEDA, HARUO
; APPLICANT: ISHIKAWA, JUN
; APPLICANT: HORIKAWA, HIROSHI
; APPLICANT: SHIBA, TADAYOSHI
; APPLICANT: SAKAKI, YOSHIYUKI
; APPLICANT: HATTORI, MASAHIRA

; TITLE OF INVENTION: NOVEL POLYNUCLEOTIDES 

; FILE REFERENCE: 249-262 

; CURRENT APPLICATION NUMBER: US/10/156,761 

; CURRENT FILING DATE: 2002-05-29 

; PRIOR APPLICATION NUMBER: JP 2001-204089 

; PRIOR FILING DATE: 2001-05-30 

; PRIOR APPLICATION NUMBER: JP 2001-272697 

; PRIOR FILING DATE: 2001-08-02 

; NUMBER OF SEQ ID NOS: 15109
; SEQ ID NO 13501
; LENGTH: 297
; TYPE: PRT
; ORGANISM: Streptomyces avermitilis
US-10-156-761-13501 

Query Match 77.6%; Score 38; DB 15; Length 297; 

Best Local Similarity 66.7%; Pred. No. 30; 

Matches 6; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       || ::||||
Db 203 CLDAQLDAC 211



RESULT 3 

US-09-811-284-225 

; Sequence 225, Application US/09811284 

; Patent No. US20020058306A1 

; GENERAL INFORMATION: 

; APPLICANT: Vogeli, Gabriel 

; TITLE OF INVENTION: Novel G Protein-Coupled Receptors

; FILE REFERENCE: 00167US1

; CURRENT APPLICATION NUMBER: US/09/811,284 

; CURRENT FILING DATE: 2001-03-16 

; PRIOR APPLICATION NUMBER: 60/189,783 

; PRIOR FILING DATE: 2000-03-16 

; PRIOR APPLICATION NUMBER: 60/189,907 

; PRIOR FILING DATE: 2000-03-16 

; PRIOR APPLICATION NUMBER: 60/189,918 

; PRIOR FILING DATE: 2000-03-16 

; PRIOR APPLICATION NUMBER: 60/189,960 

; PRIOR FILING DATE: 2000-03-16 

; PRIOR APPLICATION NUMBER: 60/189,917 

; PRIOR FILING DATE: 2000-03-16 

; PRIOR APPLICATION NUMBER: 60/192,945 

; PRIOR FILING DATE: 2000-03-29 

; PRIOR APPLICATION NUMBER: 60/192,916 

; PRIOR FILING DATE: 2000-03-29 

; PRIOR APPLICATION NUMBER: 60/192,923 

; PRIOR FILING DATE: 2000-03-29 

; PRIOR APPLICATION NUMBER: 60/192,933 

; PRIOR FILING DATE: 2000-03-29 

; PRIOR APPLICATION NUMBER: 60/192,830 

; PRIOR FILING DATE: 2000-03-29 

; PRIOR APPLICATION NUMBER: 60/192,234 



; PRIOR FILING DATE: 2000-03-27

; PRIOR APPLICATION NUMBER: 60/192,155 

; PRIOR FILING DATE: 2000-03-24 

; PRIOR APPLICATION NUMBER: 60/192,935

; PRIOR FILING DATE: 2000-03-29 

; NUMBER OF SEQ ID NOS: 258
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 225
; LENGTH: 209
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-811-284-225 



Query Match 73.5%; Score 36; DB 9; Length 209; 

Best Local Similarity 55.6%; Pred. No. 49; 

Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0;

Qy  1 CLSSRLDAC 9
      |:::||| |
Db 61 CMNNRLDPC 69



RESULT 4 

US-09-820-893-65 

; Sequence 65, Application US/09820893 

; Patent No. US20020076705A1 

; GENERAL INFORMATION: 

; APPLICANT: Rosen et al.
; TITLE OF INVENTION: 31 Human Secreted Proteins
; FILE REFERENCE: PZ033P1
; CURRENT APPLICATION NUMBER: US/09/820,893
; CURRENT FILING DATE: 2001-03-30
; PRIOR APPLICATION NUMBER: 09/531,119
; PRIOR FILING DATE: 2000-03-20
; PRIOR APPLICATION NUMBER: 60/102,895
; PRIOR FILING DATE: 1998-10-02
; NUMBER OF SEQ ID NOS: 140
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 65
; LENGTH: 114
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: SITE
; LOCATION: (114)
; OTHER INFORMATION: Xaa equals stop translation
US-09-820-893-65 



Query Match 69.4%; Score 34; DB 9; Length 114; 

Best Local Similarity 55.6%; Pred. No. 64;

Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy  1 CLSSRLDAC 9
      |:| |:| |
Db 20 CISFRVDVC 28



RESULT 5 

US-09-820-893-100 

; Sequence 100, Application US/09820893 

; Patent No. US20020076705A1
; GENERAL INFORMATION:
; APPLICANT: Rosen et al.
; TITLE OF INVENTION: 31 Human Secreted Proteins
; FILE REFERENCE: PZ033P1
; CURRENT APPLICATION NUMBER: US/09/820,893
; CURRENT FILING DATE: 2001-03-30
; PRIOR APPLICATION NUMBER: 09/531,119
; PRIOR FILING DATE: 2000-03-20
; PRIOR APPLICATION NUMBER: 60/102,895
; PRIOR FILING DATE: 1998-10-02
; NUMBER OF SEQ ID NOS: 140
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 100
; LENGTH: 132
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-820-893-100 

Query Match 69.4%; Score 34; DB 9; Length 132; 

Best Local Similarity 55.6%; Pred. No. 74; 

Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy  1 CLSSRLDAC 9
      |:| |:| |
Db 39 CISFRVDVC 47



RESULT 6 

US-09-798-412-11 

; Sequence 11, Application US/09798412 

; Publication No. US20030109428A1 

; GENERAL INFORMATION: 

; APPLICANT: Bertin, John
; TITLE OF INVENTION: NOVEL MOLECULES OF THE CARD-RELATED
; TITLE OF INVENTION: PROTEIN FAMILY AND USES THEREOF
; FILE REFERENCE: 07334-327001
; CURRENT APPLICATION NUMBER: US/09/798,412
; CURRENT FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: US 09/728,260
; PRIOR FILING DATE: 2000-12-01
; PRIOR APPLICATION NUMBER: US 09/685,791
; PRIOR FILING DATE: 2000-10-10
; PRIOR APPLICATION NUMBER: US 09/513,904
; PRIOR FILING DATE: 2000-02-25
; PRIOR APPLICATION NUMBER: US 09/507,533
; PRIOR FILING DATE: 2000-02-18
; PRIOR APPLICATION NUMBER: US 60/168,780
; PRIOR FILING DATE: 1999-12-03
; NUMBER OF SEQ ID NOS: 19
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 11
; LENGTH: 1147
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-798-412-11 

Query Match 69.4%; Score 34; DB 11; Length 1147; 

Best Local Similarity 75.0%; Pred. No. 5.8e+02; 

Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 
Qy   2 LSSRLDAC 9
       :||:||||
Db 777 ISSQLDAC 784



RESULT 7 

US-10-325-917-11 

; Sequence 11, Application US/10325917
; Publication No. US20030113787A1
; GENERAL INFORMATION:
; APPLICANT: Bertin, John
; TITLE OF INVENTION: NOVEL MOLECULES OF THE CARD-RELATED
; TITLE OF INVENTION: PROTEIN FAMILY AND USES THEREOF
; FILE REFERENCE: 07334-327001
; CURRENT APPLICATION NUMBER: US/10/325,917
; CURRENT FILING DATE: 2002-12-20
; PRIOR APPLICATION NUMBER: US/09/798,412
; PRIOR FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: US 09/728,260
; PRIOR FILING DATE: 2000-12-01
; PRIOR APPLICATION NUMBER: US 09/685,791
; PRIOR FILING DATE: 2000-10-10
; PRIOR APPLICATION NUMBER: US 09/513,904
; PRIOR FILING DATE: 2000-02-25
; PRIOR APPLICATION NUMBER: US 09/507,533
; PRIOR FILING DATE: 2000-02-18
; PRIOR APPLICATION NUMBER: US 60/168,780
; PRIOR FILING DATE: 1999-12-03
; NUMBER OF SEQ ID NOS: 19
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 11
; LENGTH: 1147
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-325-917-11

Query Match 69.4%; Score 34; DB 15; Length 1147;
Best Local Similarity 75.0%; Pred. No. 5.8e+02;
Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy   2 LSSRLDAC 9
       :||:||||
Db 777 ISSQLDAC 784



RESULT 8 

US-10-032-159A-8 

; Sequence 8, Application US/10032159A 
; Publication No. US20020164703A1 
; GENERAL INFORMATION: 



; APPLICANT: Pawlowski, Krzysztof
; APPLICANT: Reed, John C.
; APPLICANT: Godzik, Adam
; TITLE OF INVENTION: CARD-DOMAIN CONTAINING POLYPEPTIDES,
; TITLE OF INVENTION: ENCODING NUCLEIC ACIDS, AND METHODS OF USE
; FILE REFERENCE: P-LJ 5100
; CURRENT APPLICATION NUMBER: US/10/032,159A
; CURRENT FILING DATE: 2001-12-19
; PRIOR APPLICATION NUMBER: US 60/257,457
; PRIOR FILING DATE: 2000-12-21
; NUMBER OF SEQ ID NOS: 37
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 8
; LENGTH: 1247
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-032-159A-8

Query Match 69.4%; Score 34; DB 14; Length 1247;
Best Local Similarity 75.0%; Pred. No. 6.3e+02;
Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy   2 LSSRLDAC 9
       :||:||||
Db 909 ISSQLDAC 916



RESULT 9 
US-09-918-508-8 

; Sequence 8, Application US/09918508 

; Patent No. US20020177162A1 

; GENERAL INFORMATION: 

; APPLICANT: KAKIMOTO, TATSUO 

; APPLICANT: HIGUCHI, MASAYUKI
; APPLICANT: INOUE, TSUTOMU
; TITLE OF INVENTION: ANALYSIS OF AGONIST-ACTIVITY AND ANTAGONIST-ACTIVITY
; TITLE OF INVENTION: TO CYTOKININ RECEPTOR
; FILE REFERENCE: Q65478
; CURRENT APPLICATION NUMBER: US/09/918,508

; CURRENT FILING DATE: 2001-08-01 

; PRIOR APPLICATION NUMBER: JP 2001-073812 

; PRIOR FILING DATE: 2001-03-15 

; NUMBER OF SEQ ID NOS: 22 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 8 

LENGTH: 118 

TYPE: PRT 
; ORGANISM: Escherichia coli 
US-09-918-508-8 



Query Match 67.3%; Score 33; DB 10; Length 118; 

Best Local Similarity 55.6%; Pred. No. 1e+02;

Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy  1 CLSSRLDAC 9
      || | :|:|
Db 93 CLESGMDSC 101



RESULT 10 
US-09-764-864-877 

; Sequence 877, Application US/09764864
; Patent No. US20020132753A1
; GENERAL INFORMATION:
; APPLICANT: Rosen et al.
; TITLE OF INVENTION: Nucleic Acids, Proteins, and Antibodies
; FILE REFERENCE: PTZ23
; CURRENT APPLICATION NUMBER: US/09/764,864
; CURRENT FILING DATE: 2001-01-17
; Prior application data removed - consult PALM or file wrapper
; NUMBER OF SEQ ID NOS: 1792
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 877 

LENGTH: 175 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-764-864-877 

Query Match 67.3%; Score 33; DB 10; Length 175; 

Best Local Similarity 66.7%; Pred. No. 1.5e+02; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy 1 CLSSRLDAC 9
     |||||   |
Db 5 CLSSRCSLC 13



RESULT 11 
US-10-321-802-2 

Sequence 2, Application US/10321802 
Publication No. US20030200563A1 
GENERAL INFORMATION: 
APPLICANT: Butler, Karlene H. 
APPLICANT: Cahoon, Edgar B. 
APPLICANT: Cahoon, Rebecca E. 
APPLICANT: Famodu, Omolayo O. 
APPLICANT: Hall, Sarah E. 

TITLE OF INVENTION: Phopholipid:diacylglycerol Acetyltransferases
FILE REFERENCE: BB1486 US NA
CURRENT APPLICATION NUMBER: US/10/321,802
CURRENT FILING DATE: 2002-12-17 
NUMBER OF SEQ ID NOS: 36 
SOFTWARE: Microsoft Office 97 
SEQ ID NO 2 
LENGTH: 537 
TYPE: PRT 

ORGANISM: Momordica charantia 
US-10-321-802-2 

Query Match 67.3%; Score 33; DB 12; Length 537; 

Best Local Similarity 66.7%; Pred. No. 4.3e+02; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       | ||| :||
Db 503 CPSSRAEAC 511



RESULT 12 

US-09-866-050A-303 

; Sequence 303, Application US/09866050A
; Publication No. US20030040471A1
; GENERAL INFORMATION:
; APPLICANT: Watson, James D.
; APPLICANT: Strachan, Lorna
; APPLICANT: Sleeman, Matthew
; APPLICANT: Onrust, Rene
; APPLICANT: Murison, James G.
; APPLICANT: Kumble, Krishanand D.
; TITLE OF INVENTION: Compositions Isolated From Skin Cells
; TITLE OF INVENTION: and Methods for Their Use
; FILE REFERENCE: 11000.1011C4U
; CURRENT APPLICATION NUMBER: US/09/866,050A
; CURRENT FILING DATE: 2001-05-24
; NUMBER OF SEQ ID NOS: 725
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 303
; LENGTH: 617
; TYPE: PRT
; ORGANISM: Mouse
US-09-866-050A-303 

Query Match 67.3%; Score 33; DB 11; Length 617; 

Best Local Similarity 55.6%; Pred. No. 4.9e+02;

Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       || :||: |
Db 584 CLKNRLEQC 592



RESULT 13 
US-10-203-860-6 

Sequence 6, Application US/10203860 
Publication No. US20030108904A1 
GENERAL INFORMATION: 
APPLICANT: WAKAMIYA, Nobutaka
TITLE OF INVENTION: Novel Scavenger Receptor
FILE REFERENCE: 19036/38693
CURRENT APPLICATION NUMBER: US/10/203,860
CURRENT FILING DATE: 2002-08-14
PRIOR APPLICATION NUMBER: 2000-35155
PRIOR FILING DATE: 2000-02-14
PRIOR APPLICATION NUMBER: 2000-309068
PRIOR FILING DATE: 2000-10-10 
NUMBER OF SEQ ID NOS: 28 
SEQ ID NO 6 
LENGTH: 27 
TYPE: PRT 

ORGANISM: Artificial Sequence 
FEATURE : 



; OTHER INFORMATION: Modified Consensus Sequence of collectins Hybridisable with Novel
; OTHER INFORMATION: Collectin.
US-10-203-860-6 

Query Match 65.3%; Score 32; DB 15; Length 27; 

Best Local Similarity 66.7%; Pred. No. 38; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy  1 CLSSRLDAC 9
      || |||  |
Db 17 CLQSRLAIC 25



RESULT 14 
US-10-012-542-203 

; Sequence 203, Application US/10012542 

; Publication No. US20030044851A1 

; GENERAL INFORMATION: 

; APPLICANT: Ruben et al.

; TITLE OF INVENTION: 94 Human Secreted Proteins 
; FILE REFERENCE: PZ029P1 

; CURRENT APPLICATION NUMBER: US/10/012,542
; CURRENT FILING DATE: 2001-12-12 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 09/461,325 
; PRIOR FILING DATE: EARLIER FILING DATE: 1999-12-14 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,507 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,508 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,509 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,510 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/090,112 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-22 

; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/090,113 
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-22 
; NUMBER OF SEQ ID NOS: 532
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 203

LENGTH: 143 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-012-542-203 

Query Match 65.3%; Score 32; DB 15; Length 143; 

Best Local Similarity 100.0%; Pred. No. 1.8e+02; 

Matches 6; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  4 SRLDAC 9
      ||||||
Db 51 SRLDAC 56



RESULT 15 
US-10-275-555-2 



; Sequence 2, Application US/10275555 
; Publication No. US20030104450A1 
; GENERAL INFORMATION: 
; APPLICANT: Merck Patent GmbH 

; TITLE OF INVENTION: Novel regulator of G protein signalling (RGS8)
; FILE REFERENCE: RGS8CWWS
; CURRENT APPLICATION NUMBER: US/10/275,555
; CURRENT FILING DATE: 2002-11-07
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2
; LENGTH: 180
; TYPE: PRT

ORGANISM: Homo sapiens 
US-10-275-555-2 

Query Match 65.3%; Score 32; DB 15; Length 180;

Best Local Similarity 55.6%; Pred. No. 2.3e+02; 

Matches 5; Conservative 2; Mismatches 2; Indels 0; Gaps 0; 
Qy 1 CLSSRLDAC 9 



Db 




hi 

DSC 2 7 



Search completed: November 13, 2003, 09:58:27 
Job time : 18.6562 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         November 13, 2003, 09:38:30 ; Search time 9.375 Seconds
                                              (without alignments)
                                              92.322 Million cell updates/sec

Title:          US-09-228-866-3
Perfect score:  49
Sequence:       1 CLSSRLDAC 9

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       283308 seqs, 96168682 residues

Total number of hits satisfying chosen parameters:  283308

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      PIR_76:*
                1: pir1:*
                2: pir2:*
                3: pir3:*
                4: pir4:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1 
T28675 

alpha-51D immobilization antigen - Paramecium tetraurelia 
C; Species: Paramecium tetraurelia 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 20-Jun-2000
C;Accession: T28675
R;Schwegmann, K.J.
submitted to the EMBL Data Library, March 1996
A;Reference number: Z20506
A;Accession: T28675
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-2533 <SCH>
A;Cross-references: EMBL:X96400; PIDN:CAA65264.1
C;Genetics:
A;Gene: alpha-51D
A;Genetic code: SGC5
A;Introns: 280/3; 538/2; 1248/2
C;Superfamily: G surface protein

Query Match 83.7%; Score 41; DB 2; Length 2533;
Best Local Similarity 66.7%; Pred. No. 8.7;
Matches 6; Conservative 3; Mismatches 0; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       |:|:|:|||
Db 575 CISNRVDAC 583



RESULT 2 
T28674 

alpha-51D-immobilization antigen - Paramecium tetraurelia 
C; Species: Paramecium tetraurelia 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 17-Mar-2000
C;Accession: T28674
R;Schmidt, H.J.
submitted to the EMBL Data Library, March 1995
A;Reference number: Z20505
A;Accession: T28674
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-2533 <SCH>
A;Cross-references: EMBL:X85135; NID:g728634; PID:g728635; PIDN:CAA59447.1
C;Genetics:
A;Genetic code: SGC5
A;Note: alpha-51D
C;Superfamily: G surface protein

Query Match 83.7%; Score 41; DB 2; Length 2533;
Best Local Similarity 66.7%; Pred. No. 8.7;
Matches 6; Conservative 3; Mismatches 0; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       |:|:|:|||
Db 575 CISNRVDAC 583 
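
The Score, Matches, Conservative and Mismatches figures for an ungapped alignment such as the one above can be recomputed from the two sequences. This is a minimal sketch that assumes "Conservative" means a non-identical pair with a positive BLOSUM62 score and that Best Local Similarity is identical positions over aligned length; the table holds only the BLOSUM62 entries this alignment needs, and all names are illustrative rather than GenCore's.

```python
# Subset of BLOSUM62 covering only the residue pairs in this alignment.
BLOSUM62_SUBSET = {
    ("C", "C"): 9, ("S", "S"): 4, ("R", "R"): 5, ("D", "D"): 6, ("A", "A"): 4,
    ("L", "I"): 2, ("S", "N"): 1, ("L", "V"): 1,
}


def pair_score(a: str, b: str) -> int:
    """Look up a symmetric substitution score; KeyError if outside the subset."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def summarize(qy: str, db: str) -> dict:
    """Tally an ungapped alignment of two equal-length sequences."""
    if len(qy) != len(db):
        raise ValueError("ungapped alignment needs equal-length sequences")
    stats = {"score": 0, "matches": 0, "conservative": 0, "mismatches": 0}
    for a, b in zip(qy, db):
        s = pair_score(a, b)
        stats["score"] += s
        if a == b:
            stats["matches"] += 1
        elif s > 0:
            stats["conservative"] += 1  # assumed: positive-scoring substitution
        else:
            stats["mismatches"] += 1
    stats["similarity"] = round(100.0 * stats["matches"] / len(qy), 1)
    return stats


print(summarize("CLSSRLDAC", "CISNRVDAC"))
```

For this pair the tally gives a score of 41 with 6 matches, 3 conservative substitutions and no mismatches, which is what the report lists.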



RESULT 3 
T31687 

suface antigen - Paramecium primaurelia 
C; Species: Paramecium primaurelia 

C;Date: 19-May-2000 #sequence_revision 19-May-2000 #text_change 23-Mar-2001
C;Accession: T31687
R;Bourgain-Guglielmetti, F.; Caron, F.
Journal of Eukaryot. Microbiol. 43, 303-314, 1996
A;Title: Molecular characterization of the D surface protein gene subfamily in
Paramecium primaurelia.
A;Reference number: Z21061; MUID:96313351; PMID:8768434
A;Accession: T31687
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-2543 <BOU>
A;Cross-references: EMBL:X96616; NID:g1235576; PIDN:CAA65436.1
C;Genetics:
A;Genetic code: SGC5
C;Superfamily: G surface protein

Query Match 83.7%; Score 41; DB 2; Length 2543;
Best Local Similarity 66.7%; Pred. No. 8.7;
Matches 6; Conservative 3; Mismatches 0; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       |:|:|:|||
Db 575 CISNRVDAC 583



RESULT 4 
G82701 

hypothetical protein XF1273 [imported] - Xylella fastidiosa (strain 9a5c) 
C; Species: Xylella fastidiosa 

C;Date: 18-Aug-2000 #sequence_revision 20-Aug-2000 #text_change 20-Aug-2000
C;Accession: G82701
R;anonymous, The Xylella fastidiosa Consortium of the Organization for
Nucleotide Sequencing and Analysis, Sao Paulo, Brazil.
Nature 406, 151-157, 2000
A;Title: The genome sequence of the plant pathogen Xylella fastidiosa.
A;Reference number: A82515; MUID:20365717; PMID:10910347
A;Note: for a complete list of authors see reference number A59328 below
A;Accession: G82701
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-92 <SIM>
A;Cross-references: GB:AE003961; GB:AE003849; NID:g9106254; PIDN:AAF84082.1;
GSPDB:GN00128; XFSC:XF1273
A;Experimental source: strain 9a5c
R;Simpson, A.J.G.; Reinach, F.C.; Arruda, P.; Abreu, F.A.; Acencio, M.;
Alvarenga, R.; Alves, L.M.C.; Araya, J.E.; Baia, G.S.; Baptista, C.S.; Barros,
M.H.; Bonaccorsi, E.D.; Bordin, S.; Bove, J.M.; Briones, M.R.S.; Bueno, M.R.P.;
Camargo, A.A.; Camargo, L.E.A.; Carraro, D.M.; Carrer, H.; Colauto, N.B.;
Colombo, C.; Costa, F.F.; Costa, M.C.R.; Costa-Neto, C.M.; Coutinho, L.L.;
Cristofani, M.; Dias-Neto, E.; Docena, C.; El-Dorry, H.; Facincani, A.P.;
Ferreira, A.J.S.
submitted to GenBank, June 2000
A;Authors: Ferreira, V.C.A.; Ferro, J.A.; Fraga, J.S.; Franca, S.C.; Franco,
M.C.; Frohme, M.; Furlan, L.R.; Garnier, M.; Goldman, G.H.; Goldman, M.H.S.;
Gomes, S.L.; Gruber, A.; Ho, P.L.; Hoheisel, J.D.; Junqueira, M.L.; Kemper,
E.L.; Kitajima, J.P.; Krieger, J.E.; Kuramae, E.E.; Laigret, F.; Lambais, M.R.;
Leite, L.C.C.; Lemos, E.G.M.; Lemos, M.V.F.; Lopes, S.A.; Lopes, C.R.; Machado,
J.A.; Machado, M.A.; Madeira, A.M.B.N.; Madeira, H.M.F.; Marino, C.L.; Marques,
M.V.; Martins, E.A.L.
A;Authors: Martins, E.M.F.; Matsukuma, A.Y.; Menck, C.F.M.; Miracca, B.C.;
Miyaki, C.Y.; Monteiro-Vitorello, C.B.; Moon, D.H.; Nagai, M.A.; Nascimento,
A.L.T.O.; Netto, L.E.S.; Nhani Jr., A.; Nobrega, F.G.; Nunes, L.R.; Oliveira,
M.A.; de Oliveira, M.C.; de Oliveira, R.C.; Palmieri, D.A.; Paris, A.; Peixoto,
B.R.; Pereira, G.A.G.; Pereira Jr., H.A.; Pesquero, J.B.; Quaggio, R.B.;
Roberto, P.G.; Rodrigues, V.; Rosa, A.J. de M.; de Rosa Jr., V.E.; de Sa, R.G.;
Santelli, R.V.; Sawasaki, H.E.
A;Authors: da Silva, A.C.R.; da Silva, F.R.; da Silva, A.M.; Silva Jr., W.A.;
Silveira, J.F.; Silvestri, M.L.Z.; Siqueira, W.J.; de Souza, A.A.; de Souza,
A.P.; Terenzi, M.F.; Truffi, D.; Tsai, S.M.; Tsuhako, M.H.; Vallada, H.; Van
Sluys, M.A.; Verjovski-Almeida, S.; Vettore, A.L.; Zago, M.A.; Zatz, M.;
Meidanis, J.; Setubal, J.C.
A;Reference number: A59328
A;Contents: annotation
C;Genetics:
A;Gene: XF1273

Query Match 75.5%; Score 37; DB 2; Length 92;
Best Local Similarity 87.5%; Pred. No. 2.9;
Matches 7; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy  1 CLSSRLDA 8
      ||:|||||
Db 67 CLASRLDA 74



RESULT 5 
F84115 

hypothetical protein BH3726 [imported] - Bacillus halodurans (strain C-125) 
C; Species: Bacillus halodurans 

C;Date: 01-Dec-2000 #sequence_revision 01-Dec-2000 #text_change 15-Jun-2001
C;Accession: F84115
R;Takami, H.; Nakasone, K.; Takaki, Y.; Maeno, G.; Sasaki, R.; Masui, N.; Fuji
F.; Hirama, C.; Nakamura, Y.; Ogasawara, N.; Kuhara, S.; Horikoshi, K.
Nucleic Acids Res. 28, 4317-4331, 2000
A;Title: Complete genome sequence of the alkaliphilic bacterium Bacillus
halodurans and genomic sequence comparison with Bacillus subtilis.
A;Reference number: A83650; MUID:20512582; PMID:11058132
A;Accession: F84115
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-152 <STO>
A;Cross-references: GB:AP001519; GB:BA000004; NID:g10176109; PIDN:BAB07445.1;
GSPDB:GN00137
A;Experimental source: strain C-125
C;Genetics:
A;Gene: BH3726

Query Match 75.5%; Score 37; DB 2; Length 152;
Best Local Similarity 77.8%; Pred. No. 4.6;
Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       || ||| ||
Db 127 CLPSRLKAC 135



RESULT 6 
S15624 

E2 protein - human papillomavirus type 57 
C;Species: human papillomavirus type 57 
A;Note: host Homo sapiens (man) 

C;Date: 17-Feb-1994 #sequence_revision 17-Feb-1994 #text_change 16-Jul-1999 
C;Accession: S15624
R;Hirsch-Behnam, A.; Delius, H.; de Villiers, E.M.
Virus Res. 18, 81-98, 1990
A;Title: A comparative sequence analysis of two human papillomavirus (HPV) types
2a and 57.
A;Reference number: S15614; MUID:91188699; PMID:1964523
A;Accession: S15624
A;Molecule type: DNA
A;Residues: 1-383 <HIR>
A;Cross-references: EMBL:X55965; NID:g60882; PIDN:CAA39433.1; PID:g60886
C;Superfamily: papillomavirus E2 protein
C;Keywords: DNA binding; early protein; transcription regulation

Query Match 75.5%; Score 37; DB 1; Length 383;
Best Local Similarity 87.5%; Pred. No. 10;
Matches 7; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy 2 LSSRLDAC 9
     |:||||||
Db 4 LASRLDAC 11



RESULT 7 
G95290 

hypothetical protein SMa0443 [imported] - Sinorhizobium meliloti (strain 1021) 

magaplasmid pSymA 

C;Species: Sinorhizobium meliloti 

C;Date: 24-Aug-2001 #sequence_revision 24-Aug-2001 #text_change 30-Sep-2001 
C;Accession: G95290 

R;Barnett, M.J.; Fisher, R.F.; Jones, T.; Komp, C; Abola, A. P.; Barloy-Hubler , 
F . ; Bowser, L. ; Capela, D. ; Galibert, F.; Gouzy, J.; Gurjal, M. ; Hong, A. ; 
Huizar, L.; Hyman, R.W. ; Kahn, D. ; Kahn, M.L.; Kalman, S.; Keating, D.H. ; Palm, 
C; Peck, M.C.; Surzycki, R. ; Wells, D.H.; Yeh, K.C.; Davis, R.W.; Federspiel, 
N.A.; Long, S.R. 

Proc. Natl. Acad. Sci. U.S.A. 98, 9883-9888, 2001 



A;Title: Nucleotide sequence and predicted functions of the entire Sinorhizobium
meliloti pSymA megaplasmid.

A;Reference number: A95262; MUID:21396509; PMID:11481432
A;Accession: G95290
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-153 <KUR>

A;Cross-references: GB:AE006469; PIDN:AAK64889.1; PID:g14523307; GSPDB:GN00165
A;Experimental source: strain 1021, megaplasmid pSymA

R;Galibert, F.; Finan, T.M.; Long, S.R.; Puhler, A.; Abola, P.; Ampe, F.;
Barloy-Hubler, F.; Barnett, M.J.; Becker, A.; Boistard, P.; Bothe, G.; Boutry,
M.; Bowser, L.; Buhrmester, J.; Cadieu, E.; Capela, D.; Chain, P.; Cowie, A.;
Davis, R.W.; Dreano, S.; Federspiel, N.A.; Fisher, R.F.; Gloux, S.; Godrie, T.;
Goffeau, A.; Golding, B.; Gouzy, J.; Gurjal, M.; Hernandez-Lucas, I.; Hong, A.;
Huizar, L.; Hyman, R.W.; Jones, T.
Science 293, 668-672, 2001

A;Authors: Kahn, D.; Kahn, M.L.; Kalman, S.; Keating, D.H.; Kiss, E.; Komp, C.;
Lelaure, V.; Masuy, D.; Palm, C.; Peck, M.C.; Pohl, T.M.; Portetelle, D.;
Purnelle, B.; Ramsperger, U.; Surzycki, R.; Thebault, P.; Vandenbol, M.;
Vorholter, F.J.; Weidner, S.; Wells, D.H.; Wong, K.; Yeh, K.C.; Batut, J.

A;Title: The composite genome of the legume symbiont Sinorhizobium meliloti.
A;Reference number: A96039; MUID:21368234; PMID:11474104
A;Contents: annotation
C;Genetics:
A;Gene: SMa0443
A;Genome: plasmid



Query Match 71.4%; Score 35; DB 2; Length 153;
Best Local Similarity 77.8%; Pred. No. 11;
Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       || ||| ||
Db 138 CLPSRLMAC 146



RESULT 8 
T21405 

hypothetical protein F26D2.12 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T21405
R;McMurray, A.

submitted to the EMBL Data Library, November 1996
A;Reference number: Z19418
A;Accession: T21405

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-161 <WIL>

A;Cross-references: EMBL:Z81513; PIDN:CAB04182.1; GSPDB:GN00023; CESP:F26D2.12

A;Experimental source: clone F26D2

C;Genetics:

A;Gene: CESP:F26D2.12

A;Map position: 5

A;Introns: 24/1; 43/3; 79/1; 145/2



Query Match 71.4%; Score 35; DB 2; Length 161;
Best Local Similarity 87.5%; Pred. No. 12;
Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 CLSSRLDA 8

Db




RESULT 9 
S55641 

uracil DNA glycosylase - equine herpesvirus 2 
C;Species: equine herpesvirus 2

C;Date: 27-Oct-1995 #sequence_revision 03-Nov-1995 #text_change 22-Jun-1999
C;Accession: S55641

R;Telford, E.A.R.; Watson, M.S.; Aird, H.C.; Perry, J.; Davison, A.J.
J. Mol. Biol. 249, 520-528, 1995

A;Title: The DNA sequence of equine herpesvirus 2.
A;Reference number: S55594; MUID:95302501; PMID:7783207
A;Accession: S55641

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-255 <TEL>

A;Cross-references: GB:U20824; NID:g695172; PIDN:AAC13834.1; PID:g695219

A;Note: the nucleotide sequence was submitted to the EMBL Data Library, February
1995

C;Superfamily: uracil-DNA glycosylase

Query Match 71.4%; Score 35; DB 2; Length 255;

Best Local Similarity 66.7%; Pred. No. 18; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 
Qy 1 CLSSRLDAC 9 



RESULT 10 
F72616 

hypothetical protein APE1391 - Aeropyrum pernix (strain K1)
C;Species: Aeropyrum pernix

C;Date: 20-Aug-1999 #sequence_revision 20-Aug-1999 #text_change 20-Jun-2000
C;Accession: F72616

R;Kawarabayasi, Y.; Hino, Y.; Horikawa, H.; Yamazaki, S.; Haikawa, Y.; Jin-no,
K.; Takahashi, M.; Sekine, M.; Baba, S.; Ankai, A.; Kosugi, H.; Hosoyama, A.;
Fukui, S.; Nagai, Y.; Nishijima, K.; Nakazawa, H.; Takamiya, M.; Masuda, S.;
Funahashi, T.; Tanaka, T.; Kudoh, Y.; Yamazaki, J.; Kushida, N.; Oguchi, A.;
Aoki, K.; Kubota, K.; Nakamura, Y.; Nomura, N.; Sako, Y.; Kikuchi, H.
DNA Res. 6, 83-101, 1999

A;Title: Complete genome sequence of an aerobic hyper-thermophilic Crenarchaeon,
Aeropyrum pernix K1.

A;Reference number: A72450; MUID:99310339; PMID:10382966
A;Accession: F72616
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-258 <KAW>

A;Cross-references: DDBJ:AP000061; NID:g5104821; PIDN:BAA80388.1; PID:g5105074
A;Experimental source: strain K1







C;Genetics:
A;Gene: APE1391 

C;Superfamily: Aeropyrum pernix hypothetical protein APE1391 

Query Match 71.4%; Score 35; DB 2; Length 258; 

Best Local Similarity 66.7%; Pred. No. 18; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy   1 CLSSRLDAC 9
       ||| ||  |
Db 223 CLSGRLSTC 231



RESULT 11 
G87332 

hypothetical protein CC0674 [imported] - Caulobacter crescentus 
C;Species: Caulobacter crescentus

C;Date: 20-Apr-2001 #sequence_revision 20-Apr-2001 #text_change 20-Apr-2001
C;Accession: G87332

R;Nierman, W.C.; Feldblyum, T.V.; Paulsen, I.T.; Nelson, K.E.; Eisen, J.;
Heidelberg, J.F.; Alley, M.; Ohta, N.; Maddock, J.R.; Potocka, I.; Nelson, W.C.;
Newton, A.; Stephens, C.; Phadke, N.D.; Ely, B.; Laub, M.T.; DeBoy, R.T.;
Dodson, R.J.; Durkin, A.S.; Gwinn, M.L.; Haft, D.H.; Kolonay, J.F.; Smit, J.;
Craven, M.; Khouri, H.; Shetty, J.; Berry, K.; Utterback, T.; Tran, K.; Wolf,
A.; Vamathevan, J.; Ermolaeva, M.; White, O.; Salzberg, S.L.; Shapiro, L.;
Venter, J.C.; Fraser, C.M.

Proc. Natl. Acad. Sci. U.S.A. 98, 4136-4141, 2001
A;Title: Complete Genome Sequence of Caulobacter crescentus.
A;Reference number: A87249; MUID:21173698; PMID:11259647
A;Accession: G87332
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-389 <STO>

A;Cross-references: GB:AE005673; NID:g13421893; PIDN:AAK22659.1; GSPDB:GN00148
C;Genetics:
A;Gene: CC0674



Query Match 71.4%; Score 35; DB 2; Length 389;
Best Local Similarity 66.7%; Pred. No. 26;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy  1 CLSSRLDAC 9
      ||  | |||
Db 44 CLPGRADAC 52



RESULT 12 
F69157 

excinuclease ABC chain C - Methanobacterium thermoautotrophicum (strain Delta H) 
C;Species: Methanobacterium thermoautotrophicum

C;Date: 05-Dec-1997 #sequence_revision 05-Dec-1997 #text_change 20-Apr-2000
C;Accession: F69157

R;Smith, D.R.; Doucette-Stamm, L.A.; Deloughery, C.; Lee, H.; Dubois, J.;
Aldredge, T.; Bashirzadeh, R.; Blakely, D.; Cook, R.; Gilbert, K.; Harrison, D.;
Hoang, L.; Keagle, P.; Lumm, W.; Pothier, B.; Qiu, D.; Spadafora, R.; Vicaire,
R.; Wang, Y.; Wierzbowski, J.; Gibson, R.; Jiwani, N.; Caruso, A.; Bush, D.;
Safer, H.; Patwell, D.; Prabhakar, S.; McDougall, S.; Shimer, G.; Goyal, A.;
Pietrokovski, S.; Church, G.M.; Daniels, C.J.; Mao, J.; Rice, P.; Noelling, J.;
Reeve, J.N.

J. Bacteriol. 179, 7135-7155, 1997

A;Title: Complete genome sequence of Methanobacterium thermoautotrophicum Delta
H: functional analysis and comparative genomics.

A;Reference number: A69000; MUID:98037514; PMID:9371463
A;Accession: F69157
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-579 <MTH>

A;Cross-references: GB:AE000828; GB:AE000666; NID:g2621504; PIDN:AAB84947.1;
PID:g2621507

A;Experimental source: strain Delta H

C;Genetics:
A;Gene: MTH441
A;Start codon: TTG

C;Superfamily: excinuclease ABC chain C

Query Match 71.4%; Score 35; DB 2; Length 579;

Best Local Similarity 55.6%; Pred. No. 37; 

Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy   1 CLSSRLDAC 9
       ||:|::| |
Db 156 CLNSQIDLC 164



RESULT 13 
S36535 

E2 protein - human papillomavirus type 10 
C;Species: human papillomavirus type 10

C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 26-Aug-1999
C;Accession: S36535
R;Delius, H.; Hofmann, B.

submitted to the EMBL Data Library, August 1993

A;Description: Primer-directed sequencing of human papillomavirus types.
A;Reference number: S36469
A;Accession: S36535
A;Molecule type: DNA
A;Residues: 1-376 <DEL>

A;Cross-references: EMBL:X74465; NID:g396901; PIDN:CAA52492.1; PID:g396905
C;Superfamily: papillomavirus E2 protein

C;Keywords: DNA binding; early protein; transcription regulation 

Query Match 69.4%; Score 34; DB 2; Length 376; 

Best Local Similarity 75.0%; Pred. No. 40; 

Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy 2 LSSRLDAC 9
     |::|||||
Db 4 LANRLDAC 11



RESULT 14 
S36500 

E2 protein - human papillomavirus type 27 
C;Species: human papillomavirus type 27

C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 26-Aug-1999
C;Accession: S36500
R;Delius, H.; Hofmann, B.

submitted to the EMBL Data Library, August 1993

A;Description: Primer-directed sequencing of human papillomavirus types.
A;Reference number: S36469
A;Accession: S36500
A;Molecule type: DNA
A;Residues: 1-388 <DEL>

A;Cross-references: EMBL:X74473; NID:g396964; PIDN:CAA52539.1; PID:g396968
C;Superfamily: papillomavirus E2 protein

C;Keywords: DNA binding; early protein; transcription regulation 

Query Match 69.4%; Score 34; DB 2; Length 388; 

Best Local Similarity 75.0%; Pred. No. 41; 

Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 
Qy 2 LSSRLDAC 9 

Db 4 LANRLDAC 11 



RESULT 15 
S15617 

E2 protein - human papillomavirus type 2a 
C;Species: human papillomavirus type 2a
A;Note: host Homo sapiens (man)

C;Date: 17-Feb-1994 #sequence_revision 17-Feb-1994 #text_change 16-Feb-1997
C;Accession: S15617

R;Hirsch-Behnam, A.; Delius, H.; de Villiers, E.M.
Virus Res. 18, 81-98, 1990

A;Title: A comparative sequence analysis of two human papillomavirus (HPV) types
2a and 57.

A;Reference number: S15614; MUID:91188699; PMID:1964523
A;Accession: S15617
A;Molecule type: DNA
A;Residues: 1-391 <HIR>

A;Cross-references: EMBL:X55964

C;Superfamily: papillomavirus E2 protein

C;Keywords: DNA binding; early protein; transcription regulation

Query Match 69.4%; Score 34; DB 1; Length 391; 

Best Local Similarity 75.0%; Pred. No. 42; 

Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 

Qy 2 LSSRLDAC 9
     |::|||||
Db 4 LANRLDAC 11



Search completed: November 13, 2003, 09:52:52 
Job time : 11.375 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd.



OM protein - protein search, using sw model 



Run on:          November 13, 2003, 09:31:40 ; Search time 5.15625 Seconds
                 (without alignments)
                 82.083 Million cell updates/sec

Title:           US-09-228-866-3
Perfect score:   49
Sequence:        1 CLSSRLDAC 9

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        127863 seqs, 47026705 residues

Total number of hits satisfying chosen parameters: 127863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       SwissProt_41:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
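
The header figures reported for this run (perfect score 49, 47026705 residues searched in 5.15625 seconds, 82.083 million cell updates/sec) and the "Query Match" percentages in the listings can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, the diagonal values are the standard BLOSUM62 self-substitution scores, and it assumes that "Query Match" is the hit score divided by the perfect score and that a "cell update" is one query residue compared against one database residue. The report does not state either definition, although both assumptions reproduce the printed numbers.

```python
# Standard BLOSUM62 self-substitution (diagonal) scores.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}


def perfect_score(query: str) -> int:
    """Score of the query aligned against itself, with no gaps."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match_percent(score: float, perfect: int) -> float:
    """Assumed definition: hit score as a percentage of the perfect score."""
    return round(100.0 * score / perfect, 1)


def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Assumed definition: (query length x database residues) / search time."""
    return query_len * db_residues / seconds / 1e6


if __name__ == "__main__":
    query = "CLSSRLDAC"
    best = perfect_score(query)
    print("perfect score:", best)
    print("query match for score 37:", query_match_percent(37, best))
    print("throughput:",
          round(million_cell_updates_per_sec(len(query), 47026705, 5.15625), 3))
```

Under these assumptions the nine-residue query scores 9+4+4+4+5+4+6+4+9 against itself, and the same functions applied to the earlier query CNSRLHLRC give its reported perfect score of 54.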

SUMMARIES 



Result        Query
   No.  Score  Match  Length  DB  ID          Description
     1     37   75.5     383   1  VE2_HPV57   P22155 human papil
     2     36   73.5     369   1  VE2_HPV66   Q80958 human papil
     3     35   71.4     255   1  UNG_HSVE2   P53765 equine herp
     4     35   71.4     579   1  UVRC_METTH  O26541 methanobact
     5     34   69.4     376   1  VE2_HPV10   P36781 human papil
     6     34   69.4     388   1  VE2_HPV27   P36789 human papil
     7     34   69.4     388   1  VE2_HPV29   P50772 human papil
     8     34   69.4     391   1  VE2_HPV2A   P25482 human papil
     9     34   69.4     457   1  DBDR_XENLA  P42290 xenopus lae
    10     34   69.4    1147   1  CARB_HUMAN  Q9bxl7 homo sapien
    11     33   67.3     247   1  PSPA_CAVPO  P50403 cavia porce
    12     33   67.3     305   1  GP7D_CHLTR  P10561 chlamydia t
    13     33   67.3     382   1  VE2_HPV61   Q80951 human papil
    14     33   67.3     394   1  VE2_HPV32   P36791 human papil
    15     33   67.3     398   1  VE2_HPV42   P27223 human papil
    16     33   67.3     449   1  EF1C_PORPU  P50256 porphyra pu
    17     33   67.3     948   1  RCSC_SALTI  Q56128 salmonella
    18     33   67.3     948   1  RCSC_SALTY  P58662 salmonella
    19     33   67.3     949   1  RCSC_ECOLI  P14376 escherichia
    20     32   65.3     180   1  RGS8_HUMAN  P57771 homo sapien
    21     32   65.3     180   1  RGS8_RAT    P49804 rattus norv
    22     32   65.3     370   1  PSPB_RABIT  P15285 oryctolagus
    23     32   65.3     852   1  RA54_SCHPO  P41410 schizosacch
    24   31.5   64.3     134   1  FOLB_CHLPN  Q9z7e9 chlamydia p
    25     31   63.3      78   1  IBB2_PHAAN  P01061 phaseolus a
    26     31   63.3      83   1  IBB_PHALU   P01056 phaseolus l
    27     31   63.3     111   1  UL91_HCMVA  P16797 human cytom
    28     31   63.3     248   1  PSPA_HUMAN  P07714 homo sapien
    29     31   63.3     367   1  VE2_HPV11   P04015 human papil
    30     31   63.3     368   1  VE2_HPV6A   Q84294 human papil
    31     31   63.3     368   1  VE2_HPV6B   P03119 human papil
    32     31   63.3     378   1  VE2_HPV30   P36790 human papil
    33     31   63.3     384   1  VE2_HPV53   P36797 human papil
    34     31   63.3     391   1  PCL_ECTHA   P42516 ectothiorho
    35     31   63.3     398   1  DXR_YERPE   Q8zh62 yersinia pe
    36     31   63.3     401   1  DXR_VIBPA   Q87me3 vibrio para
    37     31   63.3     402   1  DXR_VIBCH   Q9kpv8 vibrio chol
    38     31   63.3     433   1  THIC_FUSNN  Q8ri60 fusobacteri
    39     31   63.3     491   1  G6PD_ECOLI  P22992 escherichia
    40     31   63.3     491   1  G6PD_ERWCH  P37986 erwinia chr
    41     31   63.3     498   1  KPYK_TRYBO  Q27788 trypanoplas
    42     31   63.3     602   1  YHOH_SCHPO  O94364 schizosacch
    43     31   63.3     702   1  ATI1_VARV   P34011 variola vir
    44     31   63.3     724   1  ATI1_VACCV  P24759 vaccinia vi
    45     31   63.3     726   1  ATI_CAMPC   Q05482 camelpox vi


ALIGNMENTS 



RESULT 1 
VE2_HPV57 

ID VE2_HPV57 STANDARD; PRT; 383 AA.
AC P22155;
DT 01-AUG-1991 (Rel. 19, Created)
DT 01-AUG-1991 (Rel. 19, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 57.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.
OX NCBI_TaxID=10597;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=91188699; PubMed=1964523;
RA Hirsch-Behnam A., Delius H., de Villiers E.M.;
RT "A comparative sequence analysis of two human papillomavirus (HPV)
RT types 2a and 57.";
RL Virus Res. 18:81-98(1990).
CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION.
CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT
CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER
CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION
CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS
CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION
CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA
CC REPLICATION.
CC -!- SUBUNIT: Binds DNA as a dimer.
CC -!- SUBCELLULAR LOCATION: Nuclear.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X55965; CAA39433.1; -.
DR PIR; S15624; S15624.
DR HSSP; P17383; 1DHM.
DR InterPro; IPR000427; E2_C.
DR InterPro; IPR001866; E2_N.
DR Pfam; PF00511; E2_C; 1.
DR Pfam; PF00508; E2_N; 1.
DR ProDom; PD000672; E2_C; 1.
DR ProDom; PD000678; E2_N; 1.
KW Early protein; Transcription regulation; Activator; DNA-binding;
KW Trans-acting factor; DNA replication; Repressor; Nuclear protein.
SQ SEQUENCE 383 AA; 42829 MW; 7F20146677D7AAEC CRC64;

Query Match 75.5%; Score 37; DB 1; Length 383;
Best Local Similarity 87.5%; Pred. No. 4.2;
Matches 7; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy 2 LSSRLDAC 9
     |:||||||
Db 4 LASRLDAC 11



RESULT 2 
VE2_HPV66

ID VE2_HPV66 STANDARD; PRT; 369 AA.
AC Q80958;
DT 15-JUL-1998 (Rel. 36, Created)
DT 15-JUL-1998 (Rel. 36, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 66.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.
OX NCBI_TaxID=37119;
RN [1]
RP SEQUENCE FROM N.A.
RA Delius H.;
RL Submitted (OCT-1995) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION.
CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT
CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER
CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION
CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS
CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION
CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA
CC REPLICATION.
CC -!- SUBUNIT: Binds DNA as a dimer.
CC -!- SUBCELLULAR LOCATION: Nuclear.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U31794; AAA79502.1; -.
DR HSSP; P17383; 1DHM.
DR InterPro; IPR000427; E2_C.
DR InterPro; IPR001866; E2_N.
DR Pfam; PF00511; E2_C; 1.
DR Pfam; PF00508; E2_N; 1.
DR ProDom; PD000672; E2_C; 1.
DR ProDom; PD000678; E2_N; 1.
KW Early protein; Transcription regulation; Activator; DNA-binding;
KW Trans-acting factor; DNA replication; Repressor; Nuclear protein.
SQ SEQUENCE 369 AA; 42781 MW; E90F265AEC397A14 CRC64;

Query Match 73.5%; Score 36; DB 1; Length 369;
Best Local Similarity 87.5%; Pred. No. 6.4;
Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 2 LSSRLDAC 9
     || |||||
Db 4 LSQRLDAC 11



RESULT 3 
UNG_HSVE2 

ID UNG_HSVE2 STANDARD; PRT; 255 AA.
AC P53765;
DT 01-OCT-1996 (Rel. 34, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE Uracil-DNA glycosylase (EC 3.2.2.-) (UDG).
GN 46.
OS Equine herpesvirus type 2 (strain 86/87) (EHV-2).
OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;
OC Gammaherpesvirinae.
OX NCBI_TaxID=82831;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=95302501; PubMed=7783207;
RA Telford E.A., Watson M.S., Aird H.C., Perry J., Davison A.J.;
RT "The DNA sequence of equine herpesvirus 2.";
RL J. Mol. Biol. 249:520-528(1995).
CC -!- FUNCTION: EXCISES URACIL RESIDUES FROM THE DNA WHICH CAN ARISE
CC AS A RESULT OF MISINCORPORATION OF DUMP RESIDUES BY DNA
CC POLYMERASE OR DUE TO DEAMINATION OF CYTOSINE.
CC -!- SIMILARITY: BELONGS TO THE URACIL-DNA GLYCOSYLASE FAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U20824; AAC13834.1; -.
DR PIR; S55641; S55641.
DR HSSP; P12295; 3EUG.
DR InterPro; IPR003249; U_glycsylse_notp.
DR InterPro; IPR002043; UDNA_glycsylse.
DR InterPro; IPR005122; UDNA_glycsylseSF.
DR Pfam; PF03167; UDG; 1.
DR ProDom; PD001589; U_glycsylse_notp; 1.
DR TIGRFAMs; TIGR00628; ung; 1.
DR PROSITE; PS00130; U_DNA_GLYCOSYLASE; 1.
KW DNA repair; Hydrolase; Glycosidase.
FT ACT_SITE 90 90 GENERAL BASE (BY SIMILARITY).
SQ SEQUENCE 255 AA; 29099 MW; 20104402C5297336 CRC64;

Query Match 71.4%; Score 35; DB 1; Length 255;
Best Local Similarity 66.7%; Pred. No. 6.9;
Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       |||: || |
Db 176 CLSNELDHC 184
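
The per-alignment statistics (Matches, Conservative, Mismatches, Best Local Similarity) can be recomputed from the two aligned strings. The following is a minimal sketch under stated assumptions, not GenCore's documented procedure: it assumes that a "conservative" column is a non-identical pair with a positive BLOSUM62 score, and that Best Local Similarity is identities divided by alignment length. The function names are hypothetical, and only the three off-diagonal BLOSUM62 scores needed for this ungapped alignment are included.

```python
# Off-diagonal BLOSUM62 scores for the three non-identical pairs in this
# alignment (S/N = 1, R/E = 0, A/H = -2). A full matrix would be needed
# for arbitrary sequences.
BLOSUM62_SUBSET = {
    frozenset("SN"): 1,
    frozenset("RE"): 0,
    frozenset("AH"): -2,
}


def classify_columns(query, subject, scores):
    """Count identical, conservative and mismatched columns (no gaps).

    Assumption: a non-identical pair counts as conservative when its
    substitution score is positive, and as a mismatch otherwise.
    Returns (matches, conservative, mismatches, match_line).
    """
    matches = conservative = mismatches = 0
    marks = []
    for q, s in zip(query, subject):
        if q == s:
            matches += 1
            marks.append("|")
        elif scores[frozenset((q, s))] > 0:
            conservative += 1
            marks.append(":")
        else:
            mismatches += 1
            marks.append(" ")
    return matches, conservative, mismatches, "".join(marks)


def best_local_similarity(matches, length):
    """Assumed definition: identities as a percentage of alignment length."""
    return round(100.0 * matches / length, 1)


if __name__ == "__main__":
    qy, db = "CLSSRLDAC", "CLSNELDHC"
    m, c, x, line = classify_columns(qy, db, BLOSUM62_SUBSET)
    print(qy)
    print(line)
    print(db)
    print(m, c, x, best_local_similarity(m, len(qy)))
```

With these assumptions the counts agree with the figures printed for this result (6 matches, 1 conservative, 2 mismatches, 66.7%), and the same rule is consistent with the other alignments in this listing.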



RESULT 4 
UVRC_METTH

ID UVRC_METTH STANDARD; PRT; 579 AA.
AC O26541;
DT 30-MAY-2000 (Rel. 39, Created)
DT 30-MAY-2000 (Rel. 39, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE UvrABC system protein C (UvrC protein) (Excinuclease ABC subunit C).
GN UVRC OR MTH441.
OS Methanobacterium thermoautotrophicum.
OC Archaea; Euryarchaeota; Methanobacteria; Methanobacteriales;
OC Methanobacteriaceae; Methanothermobacter.
OX NCBI_TaxID=187420;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Delta H;
RX MEDLINE=98037514; PubMed=9371463;
RA Smith D.R., Doucette-Stamm L.A., Deloughery C., Lee H.-M., Dubois J.,
RA Aldredge T., Bashirzadeh R., Blakely D., Cook R., Gilbert K.,
RA Harrison D., Hoang L., Keagle P., Lumm W., Pothier B., Qiu D.,
RA Spadafora R., Vicare R., Wang Y., Wierzbowski J., Gibson R.,
RA Jiwani N., Caruso A., Bush D., Safer H., Patwell D., Prabhakar S.,
RA McDougall S., Shimer G., Goyal A., Pietrovski S., Church G.M.,
RA Daniels C.J., Mao J.-I., Rice P., Noelling J., Reeve J.N.;
RT "Complete genome sequence of Methanobacterium thermoautotrophicum
RT deltaH: functional analysis and comparative genomics.";
RL J. Bacteriol. 179:7135-7155(1997).
CC -!- FUNCTION: The UvrABC repair system catalyzes the recognition and
CC processing of DNA lesions. UvrC both incises the 5' and 3' sides
CC of the lesion. The N-terminal half is responsible for the 3'
CC incision and the C-terminal half is responsible for the 5'
CC incision (By similarity).
CC -!- SUBUNIT: Interacts with uvrB in an incision complex (By
CC similarity).
CC -!- SUBCELLULAR LOCATION: Cytoplasmic (By similarity).
CC -!- SIMILARITY: Belongs to the uvrC family.
CC -!- SIMILARITY: Contains 1 UVR domain.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AE000828; AAB84947.1; -.
DR PIR; F69157; F69157.
DR HSSP; P07025; 1E52.
DR HAMAP; MF_00203; -; 1.
DR InterPro; IPR000445; HhH.
DR InterPro; IPR003583; HHH_1.
DR InterPro; IPR001943; UvrB/C.
DR InterPro; IPR004791; UvrC.
DR InterPro; IPR001162; UvrC_C.
DR InterPro; IPR000305; UvrC_N.
DR Pfam; PF01541; Exci_endo_N; 1.
DR Pfam; PF00633; HHH; 2.
DR Pfam; PF02151; UVR; 1.
DR ProDom; PD005870; UvrC_C; 1.
DR SMART; SM00465; GIYc; 1.
DR SMART; SM00278; HhH1; 2.
DR TIGRFAMs; TIGR00194; uvrC; 1.
DR PROSITE; PS50151; UVR; 1.
DR PROSITE; PS50164; UVRC_1; 1.
DR PROSITE; PS50165; UVRC_2; 1.
KW SOS response; Excision nuclease; DNA repair; DNA recombination;
KW DNA excision; Complete proteome.
FT DOMAIN 193 228 UVR.
SQ SEQUENCE 579 AA; 66293 MW; 83D3DF7B8F9E3A68 CRC64;

Query Match 71.4%; Score 35; DB 1; Length 579;
Best Local Similarity 55.6%; Pred. No. 16;
Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9

Db 156 CLNSQIDLC 164

RESULT 5 
VE2_HPV10 

ID VE2_HPV10 STANDARD; PRT; 376 AA.
AC P36781;
DT 01-JUN-1994 (Rel. 29, Created)
DT 01-JUN-1994 (Rel. 29, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 10.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.
OX NCBI_TaxID=10603;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=94265501; PubMed=8205838;
RA Delius H., Hofmann B.;
RT "Primer-directed sequencing of human papillomavirus types.";
RL Curr. Top. Microbiol. Immunol. 186:13-31(1994).
CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION.
CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT
CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER
CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION
CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS
CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION
CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA
CC REPLICATION.
CC -!- SUBUNIT: Binds DNA as a dimer.
CC -!- SUBCELLULAR LOCATION: Nuclear.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X74465; CAA52492.1; -.
DR PIR; S36535; S36535.
DR HSSP; P17383; 1DHM.
DR InterPro; IPR000427; E2_C.
DR InterPro; IPR001866; E2_N.
DR Pfam; PF00511; E2_C; 1.
DR Pfam; PF00508; E2_N; 1.
DR ProDom; PD000672; E2_C; 1.
DR ProDom; PD000678; E2_N; 1.
KW Early protein; Transcription regulation; Activator; DNA-binding;
KW Trans-acting factor; DNA replication; Repressor; Nuclear protein.
SQ SEQUENCE 376 AA; 43003 MW; 916B14B7FC51D7D1 CRC64;

Query Match 69.4%; Score 34; DB 1; Length 376;
Best Local Similarity 75.0%; Pred. No. 17;
Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy 2 LSSRLDAC 9

Db 4 LANRLDAC 11



RESULT 6
VE2_HPV27

ID VE2_HPV27 STANDARD; PRT; 388 AA.
AC P36789;
DT 01-JUN-1994 (Rel. 29, Created)
DT 01-JUN-1994 (Rel. 29, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 27.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.
OX NCBI_TaxID=31550;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=94265501; PubMed=8205838;
RA Delius H., Hofmann B.;
RT "Primer-directed sequencing of human papillomavirus types.";
RL Curr. Top. Microbiol. Immunol. 186:13-31(1994).
CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION.
CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT
CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER
CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION
CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS
CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION
CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA
CC REPLICATION.
CC -!- SUBUNIT: Binds DNA as a dimer.
CC -!- SUBCELLULAR LOCATION: Nuclear.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X74473; CAA52539.1; -.
DR PIR; S36500; S36500.
DR HSSP; P17383; 1DHM.
DR InterPro; IPR000427; E2_C.
DR InterPro; IPR001866; E2_N.
DR Pfam; PF00511; E2_C; 1.
DR Pfam; PF00508; E2_N; 1.
DR ProDom; PD000672; E2_C; 1.
DR ProDom; PD000678; E2_N; 1.
KW Early protein; Transcription regulation; Activator; DNA-binding;
KW Trans-acting factor; DNA replication; Repressor; Nuclear protein.
SQ SEQUENCE 388 AA; 43297 MW; 1C2740BA5C2C873B CRC64;

Query Match 69.4%; Score 34; DB 1; Length 388;
Best Local Similarity 75.0%; Pred. No. 17;
Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy 2 LSSRLDAC 9
     |::|||||
Db 4 LANRLDAC 11



RESULT 7 
VE2_HPV29

ID VE2_HPV29 STANDARD; PRT; 388 AA.
AC P50772;
DT 01-OCT-1996 (Rel. 34, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 01-OCT-1996 (Rel. 34, Last annotation update)
DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 29.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.
OX NCBI_TaxID=37112;
RN [1]
RP SEQUENCE FROM N.A.
RA Delius H.;
RL Submitted (OCT-1995) to the EMBL/GenBank/DDBJ databases.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U31784; AAA79432.1; -.
DR HSSP; P17383; 1DHM.
DR InterPro; IPR000427; E2_C.
DR InterPro; IPR001866; E2_N.
DR Pfam; PF00511; E2_C; 1.
DR Pfam; PF00508; E2_N; 1.
DR ProDom; PD000672; E2_C; 1.
DR ProDom; PD000678; E2_N; 1.
SQ SEQUENCE 388 AA; 44332 MW; 54422F4CD0613692 CRC64;

Query Match 69.4%; Score 34; DB 1; Length 388;
Best Local Similarity 75.0%; Pred. No. 17;
Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy 2 LSSRLDAC 9
     |::|||||
Db 4 LANRLDAC 11



RESULT 8 
VE2_HPV2A 

ID VE2_HPV2A STANDARD; PRT; 3 91 AA. 

AC P25482; 

DT 01-MAY-1992 (Rel. 22, Created) 

DT 01-MAY-1992 (Rel. 22, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Regulatory protein E2 . 

GN E2. 

OS Human papillomavirus type 2a. 



OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae; 

OC Papillomavirus . 

OX NCBI_TaxID=10584 ; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE-91188699; PubMed=l 964523 ; 

RA Hirsch-Behnam A., Delius H., de Villiers E.M.; 

RT "A comparative sequence analysis of two human papillomavirus (HPV) 

RT types 2a and 57."; 

RL Virus Res. 18:81-98(1990). 

CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION. 

CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT

CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER

CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION

CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS 

CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION 

CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA 

CC REPLICATION. 

CC -!- SUBUNIT: Binds DNA as a dimer. 

CC -!- SUBCELLULAR LOCATION: Nuclear. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X55964; -; NOT_ANNOTATED_CDS.

DR PIR; S15617; S15617. 

DR HSSP; P17383; 1DHM.

DR InterPro; IPR000427; E2_C. 

DR InterPro; IPR001866; E2_N. 

DR Pfam; PF00511; E2_C; 1. 

DR Pfam; PF00508; E2_N; 1. 

DR ProDom; PD000672; E2_C; 1. 

DR ProDom; PD000678; E2_N; 1. 

KW Early protein; Transcription regulation; Activator; DNA-binding; 

KW Trans-acting factor; DNA replication; Repressor; Nuclear protein. 

SQ SEQUENCE 391 AA; 43233 MW; 6F3862CD4A124B58 CRC64;

Query Match 69.4%; Score 34; DB 1; Length 391; 

Best Local Similarity 75.0%; Pred. No. 17; 

Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0; 
Qy 2 LSSRLDAC 9 

Db 4 LANRLDAC 11 



RESULT 9 
DBDR_XENLA 

ID DBDR_XENLA STANDARD; PRT; 457 AA. 

AC P42290; 

DT 01-NOV-1995 (Rel. 32, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE D(1B) dopamine receptor (D(5) dopamine receptor).
OS Xenopus laevis (African clawed frog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae; 

OC Xenopodinae; Xenopus. 

OX NCBI_TaxID=8355; 

RN [1] 

RP SEQUENCE FROM N.A.
RX MEDLINE=95024150; PubMed=7937989;
RA Sugamori K.S., Demchyshyn L.L., Chung M., Niznik H.B.;
RT "D1A, D1B, and D1C dopamine receptors from Xenopus laevis.";
RL Proc. Natl. Acad. Sci. U.S.A. 91:10536-10540(1994).
CC -!- FUNCTION: THIS IS ONE OF THE FIVE TYPES (D1 TO D5) OF RECEPTORS

CC FOR DOPAMINE. THE ACTIVITY OF THIS RECEPTOR IS MEDIATED BY G 

CC PROTEINS WHICH ACTIVATE ADENYLYL CYCLASE. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: BRAIN AND KIDNEY. 

CC -!- SIMILARITY: BELONGS TO FAMILY 1 OF G-PROTEIN COUPLED RECEPTORS

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U07864; AAA50829.1; -. 

DR PIR; 151660; 151660. 

DR InterPro; IPR000276; GPCR_Rhodpsn.
DR Pfam; PF00001; 7tm_1; 1.
DR PRINTS; PR00237; GPCRRHODOPSN.
DR PROSITE; PS00237; G_PROTEIN_RECEP_F1_1; 1.
DR PROSITE; PS50262; G_PROTEIN_RECEP_F1_2; 1.

KW G-protein coupled receptor; Transmembrane; Glycoprotein; 

KW Multigene family; Lipoprotein; Palmitate. 



FT DOMAIN 1 41 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 42 67 1 (POTENTIAL).
FT DOMAIN 68 78 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 79 105 2 (POTENTIAL).
FT DOMAIN 106 114 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 115 137 3 (POTENTIAL).
FT DOMAIN 138 156 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 157 181 4 (POTENTIAL).
FT DOMAIN 182 205 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 206 231 5 (POTENTIAL).
FT DOMAIN 232 282 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 283 309 6 (POTENTIAL).
FT DOMAIN 310 326 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 327 351 7 (POTENTIAL).
FT DOMAIN 352 457 CYTOPLASMIC (POTENTIAL).
FT CARBOHYD 24 24 N-LINKED (GLCNAC...) (POTENTIAL).
FT DISULFID 114 199 BY SIMILARITY.
FT LIPID 361 361 PALMITATE (BY SIMILARITY).
SQ SEQUENCE 457 AA; 51656 MW; A0A389311E4CD2FB CRC64;



Query Match 69.4%; Score 34; DB 1; Length 457; 

Best Local Similarity 55.6%; Pred. No. 20;

Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 



Qy   1 CLSSRLDAC 9
       | |:|:|:|
Db 254 CRSNRVDSC 262

RESULT 10 
CARB_HUMAN 

ID CARB_HUMAN STANDARD; PRT; 1147 AA.
AC Q9BXL7;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Caspase recruitment domain protein 11 (CARD-containing MAGUK protein
DE 3) (Carma 1).
GN CARD11 OR CARMA1.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A.
RX MEDLINE=21192234; PubMed=11278692;
RA Bertin J., Wang L., Guo Y., Jacobson M.D., Poyet J.-L.,
RA Srinivasula S.M., Merriam S., DiStefano P.S., Alnemri E.S.;
RT "CARD11 and CARD14 are novel caspase recruitment domain
RT (CARD)/membrane-associated guanylate kinase (MAGUK) family members
RT that interact with Bcl10 and activate NF-kappaB.";

RL J. Biol. Chem. 276:11877-11882(2001). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21255663; PubMed=11356195;
RA Gaide O., Martinon F., Micheau O., Bonnet D., Thome M., Tschopp J.;
RT "Carma 1, a CARD-containing binding partner of Bcl10, induces Bcl10
RT phosphorylation and NF-kappaB activation.";
RL FEBS Lett. 496:121-127(2001).
RN [3]
RP ERRATUM.
RA Gaide O., Martinon F., Micheau O., Bonnet D., Thome M., Tschopp J.;
RL FEBS Lett. 505:198-198(2001).
CC -!- FUNCTION: Activates NF-kappaB via Bcl10 and IKK. Stimulates the
CC phosphorylation of Bcl10.
CC -!- SUBUNIT: CARD11 and Bcl10 bind to each other by CARD-CARD
CC interaction.

CC -!- SUBCELLULAR LOCATION: Cytoplasmic. 

CC -!- TISSUE SPECIFICITY: Detected in adult peripheral blood leukocytes, 

CC thymus, spleen and liver. Also found in promyelocytic leukemia HL- 

CC 60 cells, chronic myelogenous leukemia K562 cells, Burkitt's 

CC lymphoma Raji cells and colorectal adenocarcinoma SW480 cells. Not
CC detected in HeLa S3, Molt-4, A549 and G431 cells.
CC -!- SIMILARITY: Contains 1 CARD domain.
CC -!- SIMILARITY: Contains 1 PDZ/DHR domain.

CC -!- SIMILARITY: Contains 1 guanylate kinase-like domain. 

CC -!- CAUTION: Supposed to contain a SH3 domain which is not detected by 



CC PROSITE, Pfam or SMART.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AF322641; AAG53402.1; -.
DR Genew; HGNC:16393; CARD11.
DR MIM; 607210; -.
DR GO; GO:0005624; C:membrane fraction; NAS.
DR GO; GO:0004384; F:membrane-associated guanylate kinase; NAS.
DR GO; GO:0005515; F:protein binding activity; IPI.
DR GO; GO:0007250; P:activation of NF-kappaB-inducing kinase; NAS.
DR InterPro; IPR001315; CARD.
DR InterPro; IPR000619; Guanylate_kin.
DR InterPro; IPR001478; PDZ.
DR SMART; SM00228; PDZ; 1.
DR PROSITE; PS50209; CARD; 1.
DR PROSITE; PS00856; GUANYLATE_KINASE_1; FALSE_NEG.
DR PROSITE; PS50052; GUANYLATE_KINASE_2; FALSE_NEG.
DR PROSITE; PS50106; PDZ; FALSE_NEG.


KW Coiled coil.
FT DOMAIN 11 103 CARD.
FT DOMAIN 123 442 COILED COIL (POTENTIAL).
FT DOMAIN 673 748 PDZ.
FT DOMAIN 966 1133 GUANYLATE KINASE.
FT CONFLICT 808 808 P -> L (IN REF. 2).
SQ SEQUENCE 1147 AA; 132641 MW; 913A4B015D2B36CC CRC64;



Query Match 69.4%; Score 34; DB 1; Length 1147;
Best Local Similarity 75.0%; Pred. No. 53;
Matches 6; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy   2 LSSRLDAC 9
Db 777 ISSQLDAC 784



RESULT 11 
PSPA_CAVPO 

ID PSPA_CAVPO STANDARD; PRT; 247 AA.

AC P50403; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Pulmonary surfactant-associated protein A precursor (SP-A) (PSP-A) 
DE (PSAP) . 

GN SFTPA1 OR SFTPA OR SFTP1. 

OS Cavia porcellus (Guinea pig).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Hystricognathi; Caviidae; Cavia.

OX NCBI_TaxID=10141; 
RN [1] 



RP SEQUENCE FROM N.A. 

RC STRAIN=Hartley; TISSUE=Lung;
RX MEDLINE=98018900; PubMed=9357868;
RA Yuan H.T., Gowan S., Kelly F.J., Bingle C.D.;
RT "Cloning of guinea pig surfactant protein A defines a distinct
RT cellular distribution pattern within the lung.";
RL Am. J. Physiol. 273:L900-L906(1997).

CC -!- FUNCTION: IN PRESENCE OF CALCIUM IONS, PSAP BINDS TO SURFACTANT 
CC PHOSPHOLIPIDS AND CONTRIBUTES TO LOWER THE SURFACE TENSION AT THE 

CC AIR-LIQUID INTERFACE IN THE ALVEOLI OF THE MAMMALIAN LUNG AND IS 

CC ESSENTIAL FOR NORMAL RESPIRATION. 

CC -!- SUBUNIT: OLIGOMERIC COMPLEX OF 6 SET OF HOMOTRIMERS.

CC -!- SUBCELLULAR LOCATION: Extracellular. 

CC -!- MISCELLANEOUS: PULMONARY SURFACTANT CONSISTS OF 90% LIPID AND 10% 

CC PROTEIN. THERE ARE 4 SURFACTANT ASSOCIATED PROTEIN: 2 COLLAGENOUS, 

CC CARBOHYDRATE-BINDING GLYCOPROTEINS (SP-A AND SP-D) AND 2 SMALL 

CC HYDROPHOBIC PROTEINS (SP-B AND SP-C).

CC -!- SIMILARITY: Contains 1 collagenous domain. 

CC -!- SIMILARITY: Contains 1 C-type lectin family domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U40869; AAB82952.1; -. 

DR HSSP; P22897; 1EGG. 

DR InterPro; IPR000087; Collagen. 

DR InterPro; IPR001304; Lectin_C.
DR Pfam; PF01391; Collagen; 1.
DR Pfam; PF00059; lectin_c; 1.
DR ProDom; PD000007; Clg_helix; 1.
DR SMART; SM00034; CLECT; 1.
DR PROSITE; PS00615; C_TYPE_LECTIN_1; 1.
DR PROSITE; PS50041; C_TYPE_LECTIN_2; 1.

KW Glycoprotein; Calcium; Surface film; Gaseous exchange; Hydroxylation; 

KW Signal; Lectin; Collagen; Repeat. 

FT SIGNAL 1 19 POTENTIAL. 

FT CHAIN 20 247 PULMONARY SURFACTANT-ASSOCIATED PROTEIN
FT A.
FT DOMAIN 27 99 COLLAGEN-LIKE.
FT DOMAIN 152 245 C-TYPE LECTIN (SHORT FORM).
FT DISULFID 154 245 BY SIMILARITY.
FT DISULFID 223 237 BY SIMILARITY.
FT CARBOHYD 20 20 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 206 206 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 247 AA; 26104 MW; D1BC86270EEFC932 CRC64;

Query Match 67.3%; Score 33; DB 1; Length 247; 

Best Local Similarity 66.7%; Pred. No. 17; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy   1 CLSSRLDAC 9
       || |||  |
Db 237 CLQSRLTIC 245



RESULT 12 
GP7D_CHLTR 

ID GP7D_CHLTR STANDARD; PRT; 305 AA. 

AC P10561; Q46427; 

DT 01-JUL-1989 (Rel. 11, Created)

DT 01-FEB-1996 (Rel. 33, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Virulence plasmid integrase pGP7-D (Protein P-11).
OS Chlamydia trachomatis.
OG Plasmid pLGV440, Plasmid pCHL1, and Plasmid pCTT1.
OC Bacteria; Chlamydiae; Chlamydiales; Chlamydiaceae; Chlamydia.
OX NCBI_TaxID=813;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=L2/434/BU; PLASMID=pLGV440;
RX MEDLINE=89013895; PubMed=2845228;
RA Comanducci M., Ricci S., Ratti G.;

RT "The structure of a plasmid of Chlamydia trachomatis believed to be 

RT required for growth within mammalian cells."; 

RL Mol. Microbiol. 2:531-538(1988). 

RN [2] 

RP SEQUENCE FROM N.A.
RC STRAIN=L1/440/LN; PLASMID=pLGV440;
RX MEDLINE=88233998; PubMed=2836808;
RA Hatt C., Ward M.E., Clarke I.N.;
RT "Analysis of the entire nucleotide sequence of the cryptic plasmid of
RT Chlamydia trachomatis serovar L1. Evidence for involvement in DNA
RT replication.";
RL Nucleic Acids Res. 16:4053-4067(1988).

RN [3] 

RP REVISIONS. 

RC STRAIN=L1/440/LN; PLASMID=pLGV440;
RA Hatt C.;
RL Submitted (APR-1994) to the EMBL/GenBank/DDBJ databases.
RN [4]
RP SEQUENCE FROM N.A.
RC STRAIN=G0/86 / Serotype D; PLASMID=pCHL1;
RX MEDLINE=90301796; PubMed=2194229;
RA Comanducci M., Ricci S., Cevenini R., Ratti G.;

RT "Diversity of the Chlamydia trachomatis common plasmid in biovars 

RT with different pathogenicity."; 

RL Plasmid 23:149-154(1990). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Serotype B; PLASMID=pCTT1;
RX MEDLINE=88177106; PubMed=3444859;

RA Sriprakash K.S., Macavoy E.S.; 

RT "Characterization and sequence of a plasmid from the trachoma biovar 

RT of Chlamydia trachomatis."; 

RL Plasmid 18:205-214(1987). 

CC -!- MISCELLANEOUS: PGP7-D IS REQUIRED FOR GROWTH WITHIN MAMMALIAN 
CC CELLS.
CC -!- MISCELLANEOUS: THE SEQUENCE SHOWN IS THAT OF PLASMID PLGV440.
CC -!- SIMILARITY: BELONGS TO THE "PHAGE" INTEGRASE FAMILY.
CC -!- CAUTION: REF.1 SEQUENCE DIFFERS FROM THAT SHOWN DUE TO A
CC FRAMESHIFT.
CC -!- CAUTION: REF.5 SEQUENCE DIFFERS FROM THAT SHOWN DUE TO A
CC FRAMESHIFT IN POSITION 254. 

CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X07547; CAA30427.1; ALT_FRAME.
DR EMBL; X06707; CAA29890.1; -.
DR EMBL; J03321; AAA91567.1; -.
DR EMBL; M19487; AAB02592.1; ALT_INIT.
DR EMBL; M19487; AAB02584.1; ALT_FRAME.
DR PIR; S01875; S01875.
DR InterPro; IPR002104; Phage_integrase.
DR Pfam; PF00589; Phage_integrase; 1.



KW DNA recombination; DNA integration; Plasmid.
FT ACT_SITE 289 289 TRANSIENT COVALENT LINKAGE TO DNA DURING
FT STRAND CLEAVAGE AND REJOINING (BY
FT SIMILARITY).
FT VARIANT 28 28 H -> Y (IN PLASMID PCHL1).
FT VARIANT 244 244 Y -> H (IN PLASMIDS PCHL1 AND PCTT1).
FT VARIANT 296 296 S -> I (IN PLASMID PCHL1).
FT VARIANT 303 303 P -> T (IN PLASMIDS PCHL1 AND PCTT1).
SQ SEQUENCE 305 AA; 34E 3 05 MW; 048C77FB84A42C19 CRC64;





Query Match 67.3%; Score 33; DB 1; Length 305;
Best Local Similarity 66.7%; Pred. No. 21;
Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
Db 279 CLSSRQSVC 287



RESULT 13 
VE2_HPV61 

ID VE2_HPV61 STANDARD; PRT; 382 AA. 

AC Q80951; 

DT 15-JUL-1998 (Rel. 36, Created)

DT 15-JUL-1998 (Rel. 36, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 61.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.
OX NCBI_TaxID=37116;
RN [1]
RP SEQUENCE FROM N.A.
RA Delius H.;
RL Submitted (OCT-1995) to the EMBL/GenBank/DDBJ databases.



CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION. 

CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT

CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER 

CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION 

CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS 

CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION 

CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA 

CC REPLICATION. 

CC -!- SUBUNIT: Binds DNA as a dimer.

CC -!- SUBCELLULAR LOCATION: Nuclear. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U31793; AAA79495.1; -. 

DR HSSP; P17383; 1DHM.

DR InterPro; IPR000427; E2_C. 

DR InterPro; IPR001866; E2_N. 

DR Pfam; PF00511; E2_C; 1. 

DR Pfam; PF00508; E2_N; 1. 

DR ProDom; PD000672; E2_C; 1. 

DR ProDom; PD000678; E2_N; 1. 

KW Early protein; Transcription regulation; Activator; DNA-binding; 

KW Trans-acting factor; DNA replication; Repressor; Nuclear protein. 

SQ SEQUENCE 382 AA; 43944 MW; 417F441DD7B772B4 CRC64;

Query Match 67.3%; Score 33; DB 1; Length 382;

Best Local Similarity 75.0%; Pred. No. 27; 

Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 
Qy 2 LSSRLDAC 9 

Db 6 LADRLDAC 13 



RESULT 14 
VE2_HPV32 

ID VE2_HPV32 STANDARD; PRT; 394 AA. 

AC P36791; 

DT 01-JUN-1994 (Rel. 29, Created)

DT 01-JUN-1994 (Rel. 29, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 32.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.
OX NCBI_TaxID=10612;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=94265501; PubMed=8205838;

RA Delius H., Hofmann B.; 



RT "Primer-directed sequencing of human papillomavirus types."; 

RL Curr. Top. Microbiol. Immunol. 186:13-31(1994). 

CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION.
CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT
CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER

CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION 

CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS 

CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION 

CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA

CC REPLICATION. 

CC -!- SUBUNIT: Binds DNA as a dimer. 

CC -!- SUBCELLULAR LOCATION: Nuclear. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).



CC 

DR EMBL; X74475; CAA52552.1; -. 

DR PIR; S36512; S36512. 

DR HSSP; P17383; 1DHM. 

DR InterPro; IPR000427; E2_C.

DR InterPro; IPR001866; E2_N. 

DR Pfam; PF00511; E2_C; 1. 

DR Pfam; PF00508; E2_N; 1. 

DR ProDom; PD000672; E2_C; 1. 

DR ProDom; PD000678; E2_N; 1. 

KW Early protein; Transcription regulation; Activator; DNA-binding; 

KW Trans-acting factor; DNA replication; Repressor; Nuclear protein. 

SQ SEQUENCE 394 AA; 45038 MW; 113C46119C2265E7 CRC64;

Query Match 67.3%; Score 33; DB 1; Length 394; 

Best Local Similarity 75.0%; Pred. No. 28; 

Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 

Qy 2 LSSRLDAC 9
     |: |||||
Db 4 LAKRLDAC 11



RESULT 15 
VE2_HPV42 

ID VE2_HPV42 STANDARD; PRT; 398 AA.
AC P27223;
DT 01-AUG-1992 (Rel. 23, Created)
DT 01-AUG-1992 (Rel. 23, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE Regulatory protein E2.
GN E2.
OS Human papillomavirus type 42.
OC Viruses; dsDNA viruses, no RNA stage; Papillomaviridae;
OC Papillomavirus.

OX NCBI_TaxID=10590; 

RN [1] 



RP SEQUENCE FROM N.A. 

RX MEDLINE=92087479; PubMed=1309278;
RA Philipp W., Honore N., Sapp M., Cole S.T., Streeck R.E.;
RT "Human papillomavirus type 42: new sequences, conserved genome
RT organization.";

RL Virology 186:331-334(1992). 

CC -!- FUNCTION: E2 REGULATES VIRAL TRANSCRIPTION AND DNA REPLICATION. 

CC IT BINDS TO THE E2RE RESPONSE ELEMENT (5'-ACCNNNNNNGGT-3') PRESENT
CC IN MULTIPLE COPIES IN THE REGULATORY REGION. IT CAN EITHER
CC ACTIVATE OR REPRESS TRANSCRIPTION DEPENDING OF E2RE'S POSITION

CC WITH REGARDS TO PROXIMAL PROMOTER ELEMENTS. REPRESSION OCCURS 

CC BY STERICALLY HINDERING THE ASSEMBLY OF THE TRANSCRIPTION 

CC INITIATION COMPLEX. THE E1-E2 COMPLEX BINDS TO THE ORIGIN OF DNA 

CC REPLICATION. 

CC -!- SUBUNIT: Binds DNA as a dimer. 

CC -!- SUBCELLULAR LOCATION: Nuclear. 

CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M73236; AAA47044.1; ALT_INIT.
DR PIR; B39451; W2WL42.
DR HSSP; P17383; 1DHM.

DR InterPro; IPR000427; E2_C. 

DR InterPro; IPR001866; E2_N. 

DR Pfam; PF00511; E2_C; 1.

DR Pfam; PF00508; E2_N; 1. 

DR ProDom; PD000672; E2_C; 1. 

DR ProDom; PD000678; E2_N; 1. 

KW Early protein; Transcription regulation; Activator; DNA-binding; 

KW Trans-acting factor; DNA replication; Repressor; Nuclear protein. 

SQ SEQUENCE 398 AA; 45309 MW; 4D41D7196372808C CRC64;

Query Match 67.3%; Score 33; DB 1; Length 398; 
Best Local Similarity 75.0%; Pred. No. 28; 

Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 

Qy 2 LSSRLDAC 9
     |: |||||
Db 4 LAKRLDAC 11

Search completed: November 13, 2003, 09:46:32 

Job time : 6.15625 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on : 



November 13, 2003, 09:31:40 ; Search time 23.7188 Seconds 



(without alignments) 

97.917 Million cell updates/sec 

Title: US-09-228-866-3

Perfect score: 49

Sequence: 1 CLSSRLDAC 9 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 830525 seqs, 258052604 residues 

Total number of hits satisfying chosen parameters: 830525 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing: Minimum Match 0%

Maximum Match 100% 
Listing first 45 summaries 

Database : SPTREMBL_23:*

1: sp_archea:*

2: sp_bacteria:*

3: sp_fungi:*

4: sp_human:*

5: sp_invertebrate:*

6: sp_mammal:*

7: sp_mhc:*

8: sp_organelle:*

9: sp_phage:*

10: sp_plant:*

11: sp_rodent:*

12: sp_virus:*

13: sp_vertebrate:*

14: sp_unclassified:*

15: sp_rvirus:*

16: sp_bacteriap:*

17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
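
Note on the header values above: the Perfect score of 49 is consistent with scoring the query CLSSRLDAC against itself under BLOSUM62, and each Query Match percentage in this report is consistent with Score divided by Perfect score. The Python sketch below only illustrates that arithmetic. It is not GenCore code: the function names are hypothetical, only the few BLOSUM62 entries needed for this query are hard-coded, and gaps are ignored because every alignment listed here reports Indels 0 and Gaps 0.

```python
# Minimal sketch (assumption: ungapped scoring; not the GenCore implementation).
# Only the BLOSUM62 entries needed for the examples in this report are included.
BLOSUM62_SUBSET = {
    ("C", "C"): 9, ("L", "L"): 4, ("S", "S"): 4,
    ("R", "R"): 5, ("D", "D"): 6, ("A", "A"): 4,
    ("L", "I"): 2, ("S", "N"): 1, ("L", "V"): 1,
}


def pair_score(a, b):
    """Look up a substitution score; the matrix is symmetric."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def ungapped_score(query, subject):
    """Sum the per-position scores of two equal-length aligned segments."""
    if len(query) != len(subject):
        raise ValueError("segments must have equal length")
    return sum(pair_score(q, s) for q, s in zip(query, subject))


def query_match_percent(score, perfect_score):
    """Query Match as a percentage of the perfect (self-alignment) score."""
    return round(100.0 * score / perfect_score, 1)


perfect = ungapped_score("CLSSRLDAC", "CLSSRLDAC")   # query against itself
hit = ungapped_score("CLSSRLDAC", "CISNRVDAC")       # segment reported for Q94719
print(perfect, hit, query_match_percent(hit, perfect))
```

Under these assumptions the self-score is 49 and the Q94719 segment scores 41, which gives the 83.7% Query Match listed for the top hits.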



SUMMARIES 

                 %
Result          Query
   No.  Score   Match  Length  DB  ID      Description

     1     41    83.7     935   5  Q94719  Q94719 Paramecium
     2     41    83.7    2533   5  P90589  P90589 Paramecium
     3     41    83.7    2533   5  Q27183  Q27183 Paramecium
     4     41    83.7    2543   5  P90649  P90649 Paramecium
     5     37    75.5      92  16  Q9PDV6  Q9pdv6 xylella fas
     6     37    75.5     152  16  Q9K6K3  Q9k6k3 bacillus ha
     7     35    71.4     115   4  Q8TCB4  Q8tcb4 homo sapien
     8     35    71.4     153  16  Q930G5  Q930g5 rhizobium m
     9     35    71.4     161   5  Q9XV60  Q9xv60 caenorhabdi
    10     35    71.4     251  12  Q8B3U7  Q8b3u7 porcine lym
    11     35    71.4       ?  17  Q9YC60  Q9yc60 aeropyrum p
    12     35    71.4     389  16  Q9AAC8  Q9aac8 caulobacter
    13     34    69.4     157  10  Q9LE05  Q9le05 medicago sa
    14     34    69.4     185   6  O62685  O62685 saimiri sci
    15     34    69.4     251  12  Q8JYB7  Q8jyb7 porcine lym
    16     34    69.4     375  12  Q9DIH4  Q9dih4 human papil
    17     34    69.4     494  16  Q8ZET7  Q8zet7 yersinia pe
    18     34    69.4     529  16  Q8D0F3  Q8d0f3 yersinia pe
    19     34    69.4     549  11  Q8BYV0  Q8byv0 mus musculu
    20     34    69.4       ?   ?  Q99131  Q99131 ustilago ma
    21     34    69.4       ?  13  Q8AY18  Q8ay18 rana escule
    22     34    69.4    1159  11  Q8CIS0  Q8cis0 mus musculu
    23     34    69.4    1171   4  Q8TES3  Q8tes3 homo sapien
    24     34    69.4    1650   5  Q8I2T7  Q8i2t7 Plasmodium
    25     34    69.4    1759   5  Q9XTP8  Q9xtp8 Plasmodium
    26     34    69.4    3306  10  Q9FT44  Q9ft44 arabidopsis
    27   33.5    68.4     700  11  Q9DBD0  Q9dbd0 mus musculu
    28   33.5    68.4     700  11  Q8VC96  Q8vc96 mus musculu
    29     33    67.3     115  16  Q9RIZ0  Q9riz0 streptomyce
    30     33    67.3     140   2  Q47504  Q47504 escherichia
    31     33    67.3     162   5  Q9UA34  Q9ua34 ostrinia nu
    32     33    67.3     162   5  Q9UA33  Q9ua33 ostrinia nu
    33     33    67.3     162   5  Q9Y1I1  Q9y1i1 ostrinia fu
    34     33    67.3     162   5  Q9TW42  Q9tw42 ostrinia nu
    35     33    67.3     162   5  Q9TVH3  Q9tvh3 ostrinia nu
    36     33    67.3     162   5  Q9Y1H9  Q9y1h9 ostrinia nu
    37     33    67.3     162   5  Q9TW57  Q9tw57 ostrinia nu
    38     33    67.3     162   5  Q9TVG5  Q9tvg5 ostrinia nu
    39     33    67.3     162   5  Q9Y1I0  Q9y1i0 ostrinia fu
    40     33    67.3     197  10  Q8RV58  Q8rv58 oryza sativ
    41     33    67.3     264  12  Q9DW32  Q9dw32 rat cytomeg
    42     33    67.3     280  17  Q9YCP1  Q9ycp1 aeropyrum p
    43     33    67.3     289   5  Q8IG38  Q8ig38 caenorhabdi
    44     33    67.3     306   5  Q18896  Q18896 caenorhabdi
    45     33    67.3     311  11  Q8R335  Q8r335 mus musculu



ALIGNMENTS 



RESULT 1 
Q94719 

ID Q94719 PRELIMINARY; PRT; 935 AA. 

AC Q94719; 

DT 01-FEB-1997 (TrEMBLrel. 02, Created)
DT 01-FEB-1997 (TrEMBLrel. 02, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Epsilon 51-D i-ag (Fragment).
GN EPSILON-51D.

OS Paramecium tetraurelia. 

OC Eukaryota; Alveolata; Ciliophora; Oligohymenophorea; Peniculida; 

OC Paramecium. 

OX NCBI_TaxID=5888;

RN [1] 

RP SEQUENCE FROM N.A. 



RC STRAIN=51; 

RA Schwegmann K., Schulte G., Schmidt H.;
RL Submitted (MAR-1996) to the EMBL/GenBank/DDBJ databases.
DR EMBL; X96557; CAA65393.1; -.
DR InterPro; IPR002895; Paramecium_SA.

DR Pfam; PF01508; Paramecium_SA; 10. 

DR SMART; SM00639; PSA; 11. 

FT NON_TER 935 935 

SQ SEQUENCE 935 AA; 97925 MW; 2D75D09F28B44903 CRC64;

Query Match 83.7%; Score 41; DB 5; Length 935;
Best Local Similarity 66.7%; Pred. No. 4.5;
Matches 6; Conservative 3; Mismatches 0; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
Db 575 CISNRVDAC 583



RESULT 2 
P90589 

ID P90589 PRELIMINARY; PRT; 2533 AA. 

AC P90589; 

DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Alpha-51D immobilization antigen.
GN ALPHA-51D.
OS Paramecium tetraurelia.
OC Eukaryota; Alveolata; Ciliophora; Oligohymenophorea; Peniculida;
OC Paramecium.
OX NCBI_TaxID=5888;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=51;
RA Schwegmann K., Klein H., Schmidt H.;
RL Submitted (MAR-1996) to the EMBL/GenBank/DDBJ databases.
DR EMBL; X96400; CAA65264.1; -.
DR InterPro; IPR002895; Paramecium_SA.

DR Pfam; PF01508; Paramecium_SA; 21. 

DR SMART; SM00639; PSA; 26. 

SQ SEQUENCE 2533 AA; 264142 MW; EAED7F21E408C371 CRC64; 

Query Match 83.7%; Score 41; DB 5; Length 2533; 

Best Local Similarity 66.7%; Pred. No. 12; 

Matches 6; Conservative 3; Mismatches 0; Indels 0; Gaps 0; 
Qy   1 CLSSRLDAC 9
       |:|:|:|||
Db 575 CISNRVDAC 583
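
The counts printed with each alignment (Matches, Conservative, Mismatches, Best Local Similarity) can be re-derived from the two aligned segments. The Python sketch below is illustrative only and rests on assumptions not stated in the report: a non-identical pair is counted as conservative when its BLOSUM62 score is positive, Best Local Similarity is taken as identities divided by aligned length, and the marker row uses "|" for identity and ":" for a conservative pair. The function name and the hard-coded pair list are hypothetical.

```python
# Minimal sketch (assumptions: ungapped alignment; "conservative" means a
# non-identical pair with a positive BLOSUM62 score). Not GenCore code.
# Only the positive-scoring pairs that occur in this report are listed.
POSITIVE_PAIRS = {
    frozenset("LI"), frozenset("LV"), frozenset("SN"), frozenset("SA"),
}


def alignment_stats(qy, db):
    """Return (matches, conservative, mismatches, similarity %, marker row)."""
    if len(qy) != len(db):
        raise ValueError("aligned segments must have equal length")
    matches = conservative = mismatches = 0
    markers = []
    for q, d in zip(qy, db):
        if q == d:
            matches += 1
            markers.append("|")
        elif frozenset((q, d)) in POSITIVE_PAIRS:
            conservative += 1
            markers.append(":")
        else:
            mismatches += 1
            markers.append(" ")
    similarity = round(100.0 * matches / len(qy), 1)
    return matches, conservative, mismatches, similarity, "".join(markers)


print(alignment_stats("CLSSRLDAC", "CISNRVDAC"))  # Paramecium hits
print(alignment_stats("LSSRLDAC", "LAKRLDAC"))    # VE2_HPV32 / VE2_HPV42
```

With these assumptions the Paramecium segment gives 6 matches, 3 conservative and 0 mismatches (66.7%), and the HPV32/HPV42 segment gives 6, 1 and 1 (75.0%), in agreement with the figures printed in the report.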



RESULT 3 
Q27183 

ID Q27183 PRELIMINARY; PRT; 2533 AA.
AC Q27183;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Alpha-51D-immobilization antigen.
GN ALPHA-51D-GENE.
OS Paramecium tetraurelia.
OC Eukaryota; Alveolata; Ciliophora; Oligohymenophorea; Peniculida;

OC Paramecium. 

OX NCBI_TaxID=5888; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=51;
RA Schmidt H.J.;
RL Submitted (MAR-1995) to the EMBL/GenBank/DDBJ databases.
DR EMBL; X85135; CAA59447.1; -.
DR InterPro; IPR002895; Paramecium_SA.

DR Pfam; PF01508; Paramecium_SA; 22. 

DR SMART; SM00639; PSA; 26. 

SQ SEQUENCE 2533 AA; 263996 MW; 261BD09806BC344D CRC64 ; 

Query Match 83.7%; Score 41; DB 5; Length 2533; 

Best Local Similarity 66.7%; Pred. No. 12; 

Matches 6; Conservative 3; Mismatches 0; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
       |:|:|:|||
Db 575 CISNRVDAC 583



RESULT 4
P90649
ID P90649 PRELIMINARY; PRT; 2543 AA.
AC P90649;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE 156D surface antigen.
OS Paramecium primaurelia.
OC Eukaryota; Alveolata; Ciliophora; Oligohymenophorea; Peniculida;
OC Paramecium.
OX NCBI_TaxID=5886;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=156;
RX MEDLINE=96313351; PubMed=8768434;
RA Bourgain-Guglielmetti F., Caron;
RT "Molecular characterization of the D surface protein gene subfamily in
RT Paramecium primaurelia.";
RL J. Eukaryot. Microbiol. 43:303-314(1996).
DR EMBL; X96616; CAA65436.1; -.
DR InterPro; IPR002895; Paramecium_SA.
DR Pfam; PF01508; Paramecium_SA; 20.
DR SMART; SM00639; PSA; 25.
SQ SEQUENCE 2543 AA; 267041 MW; 828EF797CB012902 CRC64;

Query Match 83.7%; Score 41; DB 5; Length 2543;
Best Local Similarity 66.7%; Pred. No. 12;
Matches 6; Conservative 3; Mismatches 0; Indels 0; Gaps 0;

Qy   1 CLSSRLDAC 9
Db 575 CISNRVDAC 583

RESULT 5 
Q9PDV6 

ID Q9PDV6 PRELIMINARY; PRT; 92 AA. 

AC Q9PDV6; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)

DT 01-MAR-2002 (TrEMBLrel. 20, Last annotation update)

DE Hypothetical protein XF1273.

GN XF1273. 

OS Xylella fastidiosa. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Xanthomonadales;

OC Xanthomonadaceae; Xylella.

OX NCBI_TaxID=2371; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=9a5c; 

RX MEDLINE=20365717; PubMed=10910347;
RA Simpson A.J.G., Reinach F.C., Arruda P., Abreu F.A., Acencio M.,
RA Alvarenga R., Alves L.M.C., Araya J.E., Baia G.S., Baptista C.S.,
RA Barros M.H., Bonaccorsi E.D., Bordin S., Bove J.M., Briones M.R.S.,
RA Bueno M.R.P., Camargo A.A., Camargo L.E.A., Carraro D.M., Carrer H.,
RA Colauto N.B., Colombo C., Costa F.F., Costa M.C.R., Costa-Neto C.M.,
RA Coutinho L.L., Cristofani M., Dias-Neto E., Docena C., El-Dorry H.,
RA Facincani A.P., Ferreira A.J.S., Ferreira V.C.A., Ferro J.A.,
RA Fraga J.S., Franca S.C., Franco M.C., Frohme M., Furlan L.R.,
RA Garnier M., Goldman G.H., Goldman M.H.S., Gomes S.L., Gruber A.,
RA Ho P.L., Hoheisel J.D., Junqueira M.L., Kemper E.L., Kitajima J.P.,
RA Krieger J.E., Kuramae E.E., Laigret F., Lambais M.R., Leite L.C.C.,
RA Lemos E.G.M., Lemos M.V.F., Lopes S.A., Lopes C.R., Machado J.A.,
RA Machado M.A., Madeira A.M.B.N., Madeira H.M.F., Marino C.L.,
RA Marques M.V., Martins E.A.L., Martins E.M.F., Matsukuma A.Y.,
RA Menck C.F.M., Miracca E.C., Miyaki C.Y., Monteiro-Vitorello C.B.,
RA Moon D.H., Nagai M.A., Nascimento A.L.T.O., Netto L.E.S.,
RA Nhani A. Jr., Nobrega F.G., Nunes L.R., Oliveira M.A.,
RA de Oliveira M.C., de Oliveira R.C., Palmieri D.A., Paris A.,
RA Peixoto B.R., Pereira G.A.G., Pereira H.A. Jr., Pesquero J.B.,
RA Quaggio R.B., Roberto P.G., Rodrigues V., de Rosa A.J.M.,
RA de Rosa V.E. Jr., de Sa R.G., Santelli R.V., Sawasaki H.E.,
RA da Silva A.C.R., da Silva A.M., da Silva F.R., Silva W.A. Jr.,
RA da Silveira J.F., Silvestri M.L.Z., Siqueira W.J., de Souza A.A.,
RA de Souza A.P., Terenzi M.F., Truffi D., Tsai S.M., Tsuhako M.H.,
RA Vallada H., Van Sluys M.A., Verjovski-Almeida S., Vettore A.L.,
RA Zago M.A., Zatz M., Meidanis J., Setubal J.C.;

RT "The genome sequence of the plant pathogen Xylella fastidiosa."; 

RL Nature 406:151-159(2000). 

DR EMBL; AE003961; AAF84082.1; -. 

KW Hypothetical protein; Complete proteome. 

SQ SEQUENCE 92 AA; 9993 MW; 763E024E2E909A42 CRC64;

Query Match 75.5%; Score 37; DB 16; Length 92;

Best Local Similarity 87.5%; Pred. No. 3.3;

Matches 7; Conservative 1; Mismatches 0; Indels 0; Gaps 0;



Qy 1 CLSSRLDA 8 

Db 67 CLASRLDA 74 

RESULT 6 
Q9K6K3 

ID Q9K6K3 PRELIMINARY; PRT; 152 AA. 

AC Q9K6K3; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update) 

DE Hypothetical protein BH3726. 

GN BH3726. 

OS Bacillus halodurans. 

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. 

OX NCBI_TaxID=86665; 

RN [1] 

RP SEQUENCE FROM N.A.

RC STRAIN=C-125 / JCM 9153;

RX MEDLINE=20512582; PubMed=11058132;

RA Takami H., Nakasone K., Takaki Y., Maeno G., Sasaki R., Masui N.,

RA Fuji F., Hirama C., Nakamura Y., Ogasawara N., Kuhara S.,

RA Horikoshi K.;

RT "Complete genome sequence of the alkaliphilic bacterium Bacillus 

RT halodurans and genomic sequence comparison with Bacillus subtilis."; 

RL Nucleic Acids Res. 28:4317-4331(2000). 

DR EMBL; AP001519; BAB07445.1; 

DR InterPro; IPR000086; NUDIX_hydrolase.

DR Pfam; PF00293; NUDIX; 1.

KW Hypothetical protein; Complete proteome.

SQ SEQUENCE 152 AA; 17688 MW; 2C3A461F9CEDA6DD CRC64;

Query Match 75.5%; Score 37; DB 16; Length 152; 

Best Local Similarity 77.8%; Pred. No. 5.4; 

Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy    1 CLSSRLDAC 9
        || ||| ||
Db  127 CLPSRLKAC 135

RESULT 7 
Q8TCB4 

ID Q8TCB4 PRELIMINARY; PRT; 115 AA. 

AC Q8TCB4; 

DT 01-JUN-2002 (TrEMBLrel. 21, Created) 

DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)

DE Hypothetical protein (Fragment).

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1] 



RP SEQUENCE FROM N.A. 

RC TISSUE=Lung;

RA Strausberg R.;

RL Submitted (FEB-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC022404; AAH22404.1;

KW Hypothetical protein.

FT NON_TER 1 1

SQ SEQUENCE 115 AA; 12461 MW; 4E1BDD511025F61A CRC64; 

Query Match 71.4%; Score 35; DB 4; Length 115; 

Best Local Similarity 66.7%; Pred. No. 11; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 

Qy    1 CLSSRLDAC 9
        |||  |:||
Db  104 CLSQALEAC 112



RESULT 8
Q930G5
ID Q930G5 PRELIMINARY; PRT; 153 AA.
AC Q930G5;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-MAR-2002 (TrEMBLrel. 20, Last annotation update)
DE Hypothetical protein RA0231.
GN RA0231 OR SMA0443.
OS Rhizobium meliloti (Sinorhizobium meliloti).
OG Plasmid pSymA (megaplasmid 1).
OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
OC Rhizobiaceae; Sinorhizobium.
OX NCBI_TaxID=382;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=1021;
RX MEDLINE=21396509; PubMed=11481432;
RA Barnett M.J., Fisher R.F., Jones T., Komp C., Abola A.P.,
RA Barloy-Hubler F., Bowser L., Capela D., Galibert F., Gouzy J.,
RA Gurjal M., Hong A., Huizar L., Hyman R.W., Kahn D., Kahn M.L.,
RA Kalman S., Keating D.H., Palm C., Peck M.C., Surzycki R., Wells D.H.,
RA Yeh K.-C., Davis R.W., Federspiel N.A., Long S.R.;
RT "Nucleotide sequence and predicted functions of the entire
RT Sinorhizobium meliloti pSymA megaplasmid.";
RL Proc. Natl. Acad. Sci. U.S.A. 98:9883-9888(2001).
DR EMBL; AE007216; AAK64889.1;
KW Hypothetical protein; Plasmid; Complete proteome.
SQ SEQUENCE 153 AA; 17025 MW; 0715B1AEF2C6FA04 CRC64;

Query Match 71.4%; Score 35; DB 16; Length 153;
Best Local Similarity 77.8%; Pred. No. 14;
Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy    1 CLSSRLDAC 9

Db  138 CLPSRLMAC 146



RESULT 9
Q9XV60
ID Q9XV60 PRELIMINARY; PRT; 161 AA.
AC Q9XV60;
DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE F26D2.12 protein.
GN F26D2.12.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RA McMurray A.A.;
RL Submitted (NOV-1996) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=99069613; PubMed=9851916;
RA none;
RT "Genome sequence of the nematode C. elegans: A platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
DR EMBL; Z81513; CAB04182.1; -.
DR HSSP; P05140; 2AFP.
DR WormPep; F26D2.12; CE18642.
DR InterPro; IPR001304; Lectin_C.
DR Pfam; PF00059; lectin_C; 1.
DR SMART; SM00034; CLECT; 1.
DR PROSITE; PS50041; C_TYPE_LECTIN_2; 1.
SQ SEQUENCE 161 AA; 16987 MW; B54C850333E98012 CRC64;

Query Match 71.4%; Score 35; DB 5; Length 161;
Best Local Similarity 87.5%; Pred. No. 15;
Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy    1 CLSSRLDA 8
        |||| |||
Db  101 CLSSNLDA 108



RESULT 10
Q8B3U7
ID Q8B3U7 PRELIMINARY; PRT; 251 AA.
AC Q8B3U7;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Uracil-DNA-glycosidase.
OS Porcine lymphotropic herpesvirus 2.
OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;
OC Gammaherpesvirinae.
OX NCBI_TaxID=91741;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=568;
RA Chmielewicz B., Goltz M., Franz T., Bauer C., Brema S., Ellerbrok H.,
RA Beckmann S., Rziha H.-J., Lahrmann K.-H., Romero C., Ehlers B.;
RT "A novel porcine gammaherpesvirus.";
RL Virology 0:0-0(2003).
DR EMBL; AY170317; AAO12387.1; -.
KW Glycosidase.
SQ SEQUENCE 251 AA; 28400 MW; F364582814E84F8B CRC64;

Query Match 71.4%; Score 35; DB 12; Length 251; 

Best Local Similarity 55.6%; Pred. No. 23; 

Matches 5; Conservative 4; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 CLSSRLDAC 9 

Db 172 CLSNKLNSC 180 



RESULT 11 
Q9YC60 

ID Q9YC60 PRELIMINARY; PRT; 258 AA. 

AC Q9YC60; 

DT 01-NOV-1999 (TrEMBLrel. 12, Created)

DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last annotation update) 

DE Hypothetical protein APE1391. 

GN APE1391.

OS Aeropyrum pernix.

OC Archaea; Crenarchaeota; Thermoprotei; Desulfurococcales;

OC Desulfurococcaceae; Aeropyrum. 

OX NCBI_TaxID=56636; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=K1;

RX MEDLINE=99310339; PubMed=10382966;

RA Kawarabayasi Y., Hino Y., Horikawa H., Yamazaki S., Haikawa Y.,

RA Jin-no K., Takahashi M., Sekine M., Baba S.-I., Ankai A., Kosugi H.,

RA Hosoyama A., Fukui S., Nagai Y., Nishijima K., Nakazawa H.,

RA Takamiya M., Masuda S., Funahashi T., Tanaka T., Kudoh Y.,

RA Yamazaki J., Kushida N., Oguchi A., Aoki K.-I., Kubota K.,

RA Nakamura Y., Nomura N., Sako Y., Kikuchi H.;

RT "Complete genome sequence of an aerobic hyper-thermophilic

RT crenarchaeon, Aeropyrum pernix K1.";

RL DNA Res. 6:83-101(1999). 

DR EMBL; AP000061; BAA80388.1; -. 

KW Hypothetical protein; Complete proteome. 

SQ SEQUENCE 258 AA; 28713 MW; 25C2FDB73F178AEB CRC64;

Query Match 71.4%; Score 35; DB 17; Length 258; 

Best Local Similarity 66.7%; Pred. No. 24; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy    1 CLSSRLDAC 9
        ||| ||  |
Db  223 CLSGRLSTC 231



RESULT 12 



Q9AAC8 

ID Q9AAC8 PRELIMINARY; PRT; 389 AA. 

AC Q9AAC8;

DT 01-JUN-2001 (TrEMBLrel. 17, Created)

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)

DE Hypothetical protein CC0674.

GN CC0674.

OS Caulobacter crescentus.

OC Bacteria; Proteobacteria; Alphaproteobacteria; Caulobacterales;

OC Caulobacteraceae; Caulobacter.

OX NCBI_TaxID=155892;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=ATCC 19089 / CB15; 

RX MEDLINE=21173698; PubMed=11259647;

RA Nierman W.C., Feldblyum T.V., Laub M.T., Paulsen I.T., Nelson K.E.,

RA Eisen J., Heidelberg J.F., Alley M.R.K., Ohta N., Maddock J.R.,

RA Potocka I., Nelson W.C., Newton A., Stephens C., Phadke N.D., Ely B.,

RA DeBoy R.T., Dodson R.J., Durkin A.S., Gwinn M.L., Haft D.H.,

RA Kolonay J.F., Smit J., Craven M.B., Khouri H., Shetty J., Berry K.,

RA Utterback T., Tran K., Wolf A., Vamathevan J., Ermolaeva M., White O.,

RA Salzberg S.L., Venter J.C., Shapiro L., Fraser C.M.;

RT "Complete genome sequence of Caulobacter crescentus.";

RL Proc. Natl. Acad. Sci. U.S.A. 98:4136-4141(2001).

DR EMBL; AE005743; AAK22659.1; -. 

DR TIGR; CC0674; -. 

DR InterPro; IPR000734; Lipase. 

DR InterPro; IPR000897; SRP54.

DR PROSITE; PS00120; LIPASE_SER; 1.

DR PROSITE; PS00300; SRP54; 1.

KW Hypothetical protein; Complete proteome. 

SQ SEQUENCE 389 AA; 40769 MW; C6DD05B8CE8D150E CRC64; 

Query Match 71.4%; Score 35; DB 16; Length 389; 

Best Local Similarity 66.7%; Pred. No. 36; 

Matches 6; Conservative 0; Mismatches 3; Indels 0; Gaps 
Qy 1 CLSSRLDAC 9 

Db 44 CLPGRADAC 52 



RESULT 13 
Q9LE05 

ID Q9LE05 PRELIMINARY; PRT; 157 AA. 

AC Q9LE05; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last annotation update) 

DE Resistance gene analog protein (Fragment).

OS Medicago sativa (Alfalfa).

OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;

OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;

OC eurosids I; Fabales; Fabaceae; Papilionoideae; Trifolieae; Medicago.

OX NCBI_TaxID=3879; 

RN [1] 



RP SEQUENCE FROM N.A. 

RC STRAIN=cv. UC123; 

RA Cordero J.C., Skinner D.Z.; 

RT "Isolation of the Nucleotide Binding Site Family of Resistance Gene 

RT Analogs from Alfalfa, Medicago Sativa.";

RL Submitted (FEB-2000) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF230816; AAF72899.1;

DR EMBL; AF230814; AAF72897.1;

DR InterPro; IPR002182; NB-ARC. 

DR Pfam; PF00931; NB-ARC; 1. 

FT NON_TER 1 1 

FT NON_TER 157 157 

SQ SEQUENCE 157 AA; 17959 MW; 314D07D288F5A2A2 CRC64;

Query Match 69.4%; Score 34; DB 10; Length 157;
Best Local Similarity 75.0%; Pred. No. 24;
Matches 6; Conservative 1; Mismatches 1; Indels 0; Gaps

Qy    2 LSSRLDAC 9
        :||| |||
Db    9 ISSRFDAC 16



RESULT 14
O62685
ID O62685 PRELIMINARY; PRT; 185 AA.
AC O62685;
DT 01-AUG-1998 (TrEMBLrel. 07, Created)
DT 01-AUG-1998 (TrEMBLrel. 07, Last sequence update)
DT 01-DEC-2001 (TrEMBLrel. 19, Last annotation update)
DE CD46 protein (Fragment).
OS Saimiri sciureus (Common squirrel monkey).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Platyrrhini; Cebidae; Cebinae; Saimiri.
OX NCBI_TaxID=9521;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=98184523; PubMed=9525611;
RA Hsu E.C., Sarangi F., Iorio C., Sidhu M.S., Udem S.A., Dillehay D.L.,
RA Xu W., Rota P.A., Bellini W.J., Richardson C.D.;
RT "A single amino acid change in the hemagglutinin protein of measles
RT virus determines its ability to bind CD46 and reveals another receptor
RT on marmoset B cells.";
RL J. Virol. 72:2905-2916(1998).
DR EMBL; AF025483; AAC39671.1; -.
DR HSSP; P10998; 1 WD.
DR InterPro; IPR000436; Sushi_SCR_CCP.
DR Pfam; PF00084; sushi; 2.
DR SMART; SM00032; CCP; 2.
FT NON_TER 185 185
SQ SEQUENCE 185 AA; 20733 MW; A7CF613A0DF23742 CRC64;

Query Match 69.4%; Score 34; DB 6; Length 185;
Best Local Similarity 87.5%; Pred. No. 28;
Matches 7; Conservative 0; Mismatches 1; Indels 0; Gaps

Qy    2 LSSRLDAC 9
        |||| |||
Db   28 LSSRSDAC 35



RESULT 15 
Q8JYB7 

ID Q8JYB7 PRELIMINARY; PRT; 251 AA. 

AC Q8JYB7; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created)

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)

DE Uracil-DNA-glycosidase.

OS Porcine lymphotropic herpesvirus 1.

OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;

OC Gammaherpesvirinae.

OX NCBI_TaxID=91740; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=sample #56;

RX MEDLINE=99226949; PubMed=10211967;

RA Ehlers B., Ulrich S., Goltz M.;

RT "Detection of two novel porcine herpesviruses with high similarity to

RT gammaherpesviruses.";

RL J. Gen. Virol. 80:971-978(1999). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=sample #56;

RX MEDLINE=20036635; PubMed=10567652;

RA Ulrich S., Goltz M., Ehlers B.;

RT "Characterization of the DNA polymerase loci of the novel porcine 

RT lymphotropic herpesviruses 1 and 2 in domestic and feral pigs."; 

RL J. Gen. Virol. 80:3199-3205(1999). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=sample #56; 

RX MEDLINE=22008485; PubMed=12009880;

RA Goltz M., Ericsson T., Patience C., Huang C.A., Noack S., Sachs D.H.,

RA Ehlers B.;

RT "Sequence analysis of the genome of porcine lymphotropic herpesvirus 1

RT and gene expression during post-transplant lymphoproliferative disease

RT of pigs.";

RL Virology 294:383-393(2002). 

DR EMBL; AF478169; AAM22145.1; -. 

DR InterPro; IPR002043; UDNA_glycsylse.

DR InterPro; IPR005122; UDNA_glycsylseSF.

DR InterPro; IPR003249; U_glycsylse_notp.

DR Pfam; PF03167; UDG; 1.

DR ProDom; PD001589; U_glycsylse_notp; 1.

DR TIGRFAMs; TIGR00628; ung; 1.

DR PROSITE; PS00130; U_DNA_GLYCOSYLASE; 1.

KW Glycosidase.

SQ SEQUENCE 251 AA; 28520 MW; 860234AE91746565 CRC64;

Query Match 69.4%; Score 34; DB 12; Length 251; 

Best Local Similarity 55.6%; Pred. No. 38; 

Matches 5; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 



Qy 1 CLSSRLDAC 9 

Db 172 CLSDKLNSC 180

Search completed: November 13, 2003, 09:50:59
Job time : 25.7188 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: November 13, 2003, 10:27:36 ; Search time 10.125 Seconds
(without alignments)
37.610 Million cell updates/sec

Title: US-09-228-866-3
Perfect score: 9
Sequence: 1 CLSSRLDAC 9

Scoring table: OLIGO
Gapop 60.0 , Gapext 60.0

Searched: 328717 seqs, 42310858 residues
Word size : 0

Total number of hits satisfying chosen parameters: 118358

Minimum DB seq length: 7 
Maximum DB seq length: 21 

Post-processing: Listing first 45 summaries 

Database : Issued_Patents_AA: * 

1: /cgn2_6/ptodata/1/iaa/5A_COMB.pep:*

2: /cgn2_6/ptodata/1/iaa/5B_COMB.pep:*

3: /cgn2_6/ptodata/1/iaa/6A_COMB.pep:*

4: /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*

5: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*

6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1 
US-08-526-710-3 

; Sequence 3, Application US/08526710 
; Patent No. 5622699 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell and Flores 

STREET: 4370 La Jolla Village Drive, Suite 700 
CITY: San Diego 
STATE: California 
COUNTRY: United States 
ZIP: 92122 
COMPUTER READABLE FORM: 



MEDIUM TYPE: Floppy disk

COMPUTER: IBM PC compatible

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/526,710

FILING DATE: 11-SEP-1995

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 1779 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: circular 
MOLECULE TYPE: peptide 
US-08-526-710-3 



Query Match 100.0%; Score 9; DB 1; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 CLSSRLDAC 9 

Db 1 CLSSRLDAC 9 



RESULT 2 
US-08-862-855-3 

; Sequence 3, Application US/08862855 

; Patent No. 6068829 

; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/862,855

FILING DATE: 



CLASSIFICATION: 424
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/526,710
FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273
FILING DATE: 10-MAR-1997
ATTORNEY/AGENT INFORMATION:
NAME: Campbell, Cathryn A.
REGISTRATION NUMBER: 31,815
REFERENCE/DOCKET NUMBER: P-LJ 2621
TELECOMMUNICATION INFORMATION:
TELEPHONE: (619) 535-9001
TELEFAX: (619) 535-8949
INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS:
LENGTH: 9 amino acids
TYPE: amino acid
TOPOLOGY: circular
MOLECULE TYPE: peptide 
US-08-862-855-3 

Query Match 100.0%; Score 9; DB 3; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 
Qy 1 CLSSRLDAC 9 

Db 1 CLSSRLDAC 9 



RESULT 3 
US-09-226-985-3 

; Sequence 3, Application US/09226985 

; Patent No. 6296832 

; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 
; APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Molecules That Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 
STREET: 4370 La Jolla Village Drive, Suite 700 
; CITY: San Diego 

STATE: California 
COUNTRY: United States 
ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/226,985
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 



APPLICATION NUMBER: US 08/526,710

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273

FILING DATE: 10-MAR-1997
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/862,855

FILING DATE: 23-MAY-1997
ATTORNEY/AGENT INFORMATION:

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 3423 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: circular 
MOLECULE TYPE: peptide 
US-09-226-985-3 



Query Match 100.0%; Score 9; DB 3; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 
Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 CLSSRLDAC 9
        |||||||||
Db    1 CLSSRLDAC 9



RESULT 4 
US-09-227-906-3 

; Sequence 3, Application US/09227906 
; Patent No. 6306365 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/227,906 

FILING DATE: 



CLASSIFICATION:
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/526,710

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273

FILING DATE: 10-MAR-1997
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 

FILING DATE: 23-MAY-1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 3424 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS: 

LENGTH: 9 amino acids 

TYPE: amino acid 

TOPOLOGY: circular 
MOLECULE TYPE: peptide 
US-09-227-906-3 



Query Match 100.0%; Score 9; DB 4; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 
Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 CLSSRLDAC 9
        |||||||||
Db    1 CLSSRLDAC 9



RESULT 5 

US-08-526-710-19 

; Sequence 19, Application US/08526710 
; Patent No. 5622699
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell and Flores 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/526,710

FILING DATE: 11-SEP-1995

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 1779 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 19: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 7 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-526-710-19 

Query Match 77.8%; Score 7; DB 1; Length 7; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 
Matches 7; Conservative 0; Mismatches 0; Indels 

Qy    2 LSSRLDA 8
        |||||||
Db    1 LSSRLDA 7



RESULT 6 

US-08-862-855-19 

; Sequence 19, Application US/08862855 

; Patent No. 6068829 

; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/862,855

FILING DATE: 

CLASSIFICATION: 424 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273

FILING DATE: 10-MAR-1997
ATTORNEY/AGENT INFORMATION:
; NAME: Campbell, Cathryn A.

REGISTRATION NUMBER: 31,815

REFERENCE/DOCKET NUMBER: P-LJ 2621
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 19: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 7 amino acids 

; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-862-855-19 

Query Match 77.8%; Score 7; DB 3; Length 7; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05;

Matches 7; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    2 LSSRLDA 8
        |||||||
Db    1 LSSRLDA 7



RESULT 7 

US-09-226-985-19 

; Sequence 19, Application US/09226985 

; Patent No. 6296832 

; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Molecules That Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 
; STATE: California 

COUNTRY: United States 

ZIP: 92122
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/226,985 

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273

FILING DATE: 10-MAR-1997
PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 08/862,855

FILING DATE: 23-MAY-1997
ATTORNEY/AGENT INFORMATION:
; NAME: Campbell, Cathryn A.

REGISTRATION NUMBER: 31,815
REFERENCE/DOCKET NUMBER: P-LJ 3423
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 19: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 7 amino acids 
; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-09-226-985-19 

Query Match 77.8%; Score 7; DB 3; Length 7; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 7; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    2 LSSRLDA 8
        |||||||
Db    1 LSSRLDA 7



RESULT 8 

US-09-227-906-19

; Sequence 19, Application US/09227906
; Patent No. 6306365
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 
; STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/227,906

FILING DATE: 

CLASSIFICATION: 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 

FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/813,273

FILING DATE: 10-MAR-1997
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 

FILING DATE: 23-MAY-1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Campbell, Cathryn A. 

REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 3424 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 19: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 7 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-09-227-906-19 

Query Match 77.8%; Score 7; DB 4; Length 7; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 7; Conservative 0; Mismatches 0; Indels 0; 

Qy    2 LSSRLDA 8
        |||||||
Db    1 LSSRLDA 7



RESULT 9 

US-08-469-260A-241 

Sequence 241, Application US/08469260A 
Patent No. 6451578 
GENERAL INFORMATION: 



APPLICANT: JOHN N. SIMONS
APPLICANT: TAMI J. PILOT-MATIAS
APPLICANT: GEORGE J. DAWSON
APPLICANT: GEORGE G. SCHLAUDER
APPLICANT: SURESH M. DESAI
APPLICANT: THOMAS P. LEARY
APPLICANT: ANTHONY SCOTT MUERHOFF
APPLICANT: JAMES C. ERKER
APPLICANT: SHERI L. BUIJK
APPLICANT: ISA K. MUSHAHWAR

TITLE OF INVENTION: NON-A, NON-B. NON-C, NON-D, NON-E HEPATITIS
TITLE OF INVENTION: REAGENTS AND METHODS FOR THEIR USE
NUMBER OF SEQUENCES: 716
CORRESPONDENCE ADDRESS:

ADDRESSEE: ABBOTT LABORATORIES D377/AP6D
STREET: 100 ABBOTT PARK ROAD
CITY: ABBOTT PARK
STATE: IL
COUNTRY: USA
ZIP: 60064-3500
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/469,260A

FILING DATE:
CLASSIFICATION:
; PRIOR APPLICATION DATA:

APPLICATION NUMBER: US/08/424,550
FILING DATE:
ATTORNEY/AGENT INFORMATION:

NAME: POREMBSKI, PRISCILLA E.
REGISTRATION NUMBER: 33,207
REFERENCE/DOCKET NUMBER: 5527.PC.01
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 708-937-6365 
TELEFAX: 708-938-2623 
; INFORMATION FOR SEQ ID NO: 241: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 17 amino acids 
; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-469-260A-241 

Query Match 55.6%; Score 5; DB 4; Length 17; 

Best Local Similarity 100.0%; Pred. No. 20; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; 

Qy    2 LSSRL 6
        |||||
Db    7 LSSRL 11



RESULT 10 
US-08-488-446-241 

Sequence 241, Application US/08488446 
Patent No. 6558898 
GENERAL INFORMATION: 

APPLICANT: JOHN N. SIMONS 
APPLICANT: TAMI J. PILOT-MATIAS 
APPLICANT: GEORGE J. DAWSON 
APPLICANT: GEORGE G. SCHLAUDER 
APPLICANT: SURESH M. DESAI 
APPLICANT: THOMAS P. LEARY 
APPLICANT: ANTHONY SCOTT MUERHOFF 
APPLICANT: JAMES C. ERKER 
APPLICANT: SHERI L. BUIJK 
APPLICANT: ISA K. MUSHAHWAR 

TITLE OF INVENTION: NON-A, NON-B. NON-C, NON-D, NON-E HEPATITIS 
TITLE OF INVENTION: REAGENTS AND METHODS FOR THEIR USE 
NUMBER OF SEQUENCES: 716 
CORRESPONDENCE ADDRESS : 

ADDRESSEE: ABBOTT LABORATORIES D377/AP6D 
STREET: 100 ABBOTT PARK ROAD 
CITY: ABBOTT PARK 
STATE: IL 
COUNTRY: USA
ZIP: 60064-3500
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/488,446 
; FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/08/424,550 

FILING DATE: 
ATTORNEY/AGENT INFORMATION: 

NAME: POREMBSKI, PRISCILLA E. 

REGISTRATION NUMBER: 33,207 

REFERENCE/DOCKET NUMBER: 5527.PC.01
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 708-937-6365 

TELEFAX: 708-938-2623 
; INFORMATION FOR SEQ ID NO: 241: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 17 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-488-446-241 

Query Match 55.6%; Score 5; DB 4; Length 17; 

Best Local Similarity 100.0%; Pred. No. 20; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          2 LSSRL 6
              |||||
Db          7 LSSRL 11



RESULT 11 

US-08-467-344A-241 

; Sequence 241, Application US/08467344A 

; Patent No. 6586568 

; GENERAL INFORMATION: 

APPLICANT: JOHN N. SIMONS
APPLICANT: TAMI J. PILOT-MATIAS
APPLICANT: GEORGE J. DAWSON
APPLICANT: GEORGE G. SCHLAUDER
APPLICANT: SURESH M. DESAI
APPLICANT: THOMAS P. LEARY
APPLICANT: ANTHONY SCOTT MUERHOFF
APPLICANT: JAMES C. ERKER
APPLICANT: SHERI L. BUIJK
APPLICANT: ISA K. MUSHAHWAR
TITLE OF INVENTION: NON-A, NON-B. NON-C, NON-D, NON-E HEPATITIS
TITLE OF INVENTION: REAGENTS AND METHODS FOR THEIR USE
NUMBER OF SEQUENCES: 716 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: ABBOTT LABORATORIES D377/AP6D 
; STREET: 100 ABBOTT PARK ROAD

CITY: ABBOTT PARK

STATE: IL 
; COUNTRY: USA 

ZIP: 60064-3500 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/467,344A

FILING DATE: 07-Jun-1995

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/424,550 

FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION:

NAME: POREMBSKI, PRISCILLA E.

REGISTRATION NUMBER: 33,207

REFERENCE/DOCKET NUMBER: 5527.PC.01
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 708-937-6365 

TELEFAX: 708-938-2623 
INFORMATION FOR SEQ ID NO: 241: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 17 amino acids 
; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 

SEQUENCE DESCRIPTION: SEQ ID NO: 241: 
US-08-467-344A-241 

Query Match 55.6%; Score 5; DB 4; Length 17; 

Best Local Similarity 100.0%; Pred. No. 20; 

Matches 5; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          2 LSSRL 6
              |||||
Db          7 LSSRL 11



RESULT 12 
US-08-503-062-14 

; Sequence 14, Application US/08503062 
; Patent No. 5723300 
; GENERAL INFORMATION: 

APPLICANT: Denis, Gerald V. 

APPLICANT: Green, Michael R. 

TITLE OF INVENTION: NUCLEAR LOCALIZED TRANSCRIPTION FACTOR 
TITLE OF INVENTION: KINASES AND DIAGNOSTIC ASSAYS RELATED THERETO 
NUMBER OF SEQUENCES: 25 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 

CITY: Boston 

STATE: MA
COUNTRY: USA
; ZIP: 02110-2804

COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/503,062 

FILING DATE: 10-JUL-1995 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Clark, Paul T. 
; REGISTRATION NUMBER: 30,162 

REFERENCE/DOCKET NUMBER: 04020/080001
TELECOMMUNICATION INFORMATION:

TELEPHONE: 617/542-5070

TELEFAX: 617/542-8906

TELEX: 200154
; INFORMATION FOR SEQ ID NO: 14: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 7 amino acids 
; TYPE: amino acid 

STRANDEDNESS : not relevant 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-503-062-14 

Query Match 44.4%; Score 4; DB 1; Length 7; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 4; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          2 LSSR 5
              ||||
Db          3 LSSR 6



RESULT 13 
US-09-461-697-330 

; Sequence 330, Application US/09461697 
; Patent No. 6277974
; GENERAL INFORMATION:

; APPLICANT: COGENT NEUROSCIENCE, Inc.

; APPLICANT: Lo, Donald C. 

; APPLICANT: Barney, Shawn 

; APPLICANT: Thomas, Mary Beth 

; APPLICANT: Portbury, Stuart D. 

APPLICANT: Puranam, Kasturi 
; APPLICANT: Katz, Lawrence C. 

; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR DIAGNOSING 

; TITLE OF INVENTION: AND TREATING CONDITIONS, DISORDERS, OR DISEASES INVOLVING 

; TITLE OF INVENTION: CELL DEATH 

; FILE REFERENCE: 10001-005-999 

; CURRENT APPLICATION NUMBER: US/09/461,697

; CURRENT FILING DATE: 1999-12-14

; NUMBER OF SEQ ID NOS: 466

SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 330
LENGTH: 7 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-461-697-330 

Query Match 44.4%; Score 4; DB 3; Length 7; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 4; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          4 SRLD 7
              ||||
Db          4 SRLD 7



RESULT 14 
PCT-US96-11495-14 

; Sequence 14, Application PC/TUS9611495 
; GENERAL INFORMATION: 

APPLICANT: UNIVERSITY OF MASSACHUSETTS MEDICAL CENTER 
TITLE OF INVENTION: NUCLEAR LOCALIZED TRANSCRIPTION FACTOR 
TITLE OF INVENTION: KINASES AND DIAGNOSTIC ASSAYS RELATED THERETO 
; NUMBER OF SEQUENCES : 25 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 

CITY: Boston 

STATE: MA 

COUNTRY : USA 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: PCT/US96/11495 
; FILING DATE: 03-JUL-1996 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/503,062 
; FILING DATE: 10-JUL-1995 

CLASSIFICATION: 
ATTORNEY/AGENT INFORMATION: 

NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 04020/080WO1
TELECOMMUNICATION INFORMATION:

TELEPHONE: 617/542-5070

TELEFAX: 617/542-8906

TELEX: 200154
INFORMATION FOR SEQ ID NO: 14: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 7 amino acids 

TYPE: amino acid 

STRANDEDNESS : not relevant 

TOPOLOGY: linear 
MOLECULE TYPE: protein
PCT-US96-11495-14 

Query Match 44.4%; Score 4; DB 5; Length 7; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05;

Matches 4; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          2 LSSR 5
              ||||
Db          3 LSSR 6



RESULT 15 
US-07-834-848-14 

Sequence 14, Application US/07834848 
Patent No. 5436221 
GENERAL INFORMATION: 

APPLICANT: KITAGUCHI, HIROSHI
APPLICANT: KOMAZAWA, HIROYUKI 
APPLICANT: KOJIMA, MASAYOSHI 
APPLICANT: MORI, HIDETO 
APPLICANT: NISHIKAWA, NAOYUKI 
APPLICANT: SATOH, HIDEAKI 
APPLICANT: ORIKASA, ATSUSHI 
APPLICANT: ONO, MITSUNORI 
APPLICANT: AZUMA, ICHIRO 
APPLICANT: SAIKI, IKUO 

TITLE OF INVENTION: PEPTIDE DERIVATIVES AND APPLICATION 
TITLE OF INVENTION: THEREOF 
NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sughrue, Mion, Zinn, Macpeak, & Seas 
STREET: 2100 Pennsylvania Ave., NW 
CITY: Washington 
STATE : DC 
COUNTRY : USA 
ZIP: 20037-3202 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/07/834,848
FILING DATE: 19920213 
CLASSIFICATION: 514 
ATTORNEY/AGENT INFORMATION:
NAME: Biggart, Waddell A.
REGISTRATION NUMBER: 24,861
REFERENCE/DOCKET NUMBER: Q28480
TELECOMMUNICATION INFORMATION:
TELEPHONE: (202) 293-7060
TELEFAX: (202) 293-7860
TELEX: 6491103
INFORMATION FOR SEQ ID NO: 14:
SEQUENCE CHARACTERISTICS:
LENGTH: 8 amino acids
TYPE: AMINO ACID
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-07-834-848-14 

Query Match 44.4%; Score 4; DB 1; Length 8;

Best Local Similarity 100.0%; Pred. No. 2.5e+05;
Matches 4; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          4 SRLD 7
              ||||
Db          4 SRLD 7



Search completed: November 13, 2003, 10:41:54 
Job time : 10.125 secs
GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.

OM protein - protein search, using sw model

Run on:         November 13, 2003, 09:39:50 ; Search time 10.6875 Seconds
                (without alignments)
                35.630 Million cell updates/sec

Title:          US-09-228-866-3
Perfect score:  49
Sequence:       1 CLSSRLDAC 9

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       328717 seqs, 42310858 residues

Total number of hits satisfying chosen parameters: 328717

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Issued_Patents_AA:*
                1: /cgn2_6/ptodata/1/iaa/5A_COMB.pep:*
                2: /cgn2_6/ptodata/1/iaa/5B_COMB.pep:*
                3: /cgn2_6/ptodata/1/iaa/6A_COMB.pep:*
                4: /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*
                5: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*
                6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
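
The percentages printed with each result can be checked by hand against the header values. The sketch below is an inference from the numbers in this report, not a documented GenCore formula: it assumes Query Match is Score divided by Perfect score, and Best Local Similarity is Matches divided by the number of aligned columns. The function names and the partial BLOSUM62 diagonal table (only the residues present in the query) are illustrative assumptions.

```python
# Diagonal (self-match) BLOSUM62 scores, limited to the residues that
# occur in the query CLSSRLDAC. This is a hand-copied subset, not a full matrix.
BLOSUM62_DIAGONAL = {"A": 4, "C": 9, "D": 6, "L": 4, "R": 5, "S": 4}


def perfect_score(query: str) -> int:
    """Score of the query aligned against itself with no gaps."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match(score: float, perfect: float) -> float:
    """Score as a percentage of the perfect score (assumed definition)."""
    return round(100.0 * score / perfect, 1)


def best_local_similarity(matches: int, aligned_columns: int) -> float:
    """Identical positions as a percentage of aligned columns (assumed definition)."""
    return round(100.0 * matches / aligned_columns, 1)


if __name__ == "__main__":
    perfect = perfect_score("CLSSRLDAC")
    print("Perfect score:", perfect)
    # Values taken from a hit with Score 38 and 7 matches over 9 columns.
    print("Query Match %:", query_match(38, perfect))
    print("Best Local Similarity %:", best_local_similarity(7, 9))
```

Under these assumptions the computed perfect score agrees with the header, and the two percentages agree with the hit that reports Score 38 over a 9-column alignment with 7 matches.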
Patent No. 5175384
Patent No. 5189147
Sequence 7, Appli 
Sequence 2, Appli 
Sequence 4, Appli 
Sequence 3, Appli 



ALIGNMENTS 



RESULT 1 
US-08-526-710-3 

; Sequence 3, Application US/08526710 
; Patent No. 5622699 
GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell and Flores 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 

STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/526,710 
FILING DATE: 11-SEP-1995
CLASSIFICATION: 435
ATTORNEY/AGENT INFORMATION:
NAME: Campbell, Cathryn A. 
REGISTRATION NUMBER: 31,815 
REFERENCE/DOCKET NUMBER: P-LJ 1779 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 9 amino acids 
TYPE: amino acid 
TOPOLOGY: circular
MOLECULE TYPE: peptide
US-08-526-710-3

Query Match 100.0%; Score 49; DB 1; Length 9;
Best Local Similarity 100.0%; Pred. No. 2.5e+05;
Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 CLSSRLDAC 9
              |||||||||
Db          1 CLSSRLDAC 9



RESULT 2 
US-08-862-855-3 

; Sequence 3, Application US/08862855 
; Patent No. 6068829 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 
; APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 
STREET: 4370 La Jolla Village Drive, Suite 700 
; CITY: San Diego 

STATE: California 
COUNTRY: United States 
ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: Patentln Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/862,855
FILING DATE:
; CLASSIFICATION: 424

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 
FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273
FILING DATE: 10-MAR-1997
ATTORNEY/AGENT INFORMATION:
; NAME: Campbell, Cathryn A.

REGISTRATION NUMBER: 31,815
; REFERENCE/DOCKET NUMBER: P-LJ 2621

TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 9 amino acids 
; TYPE: amino acid 

; TOPOLOGY: circular 

; MOLECULE TYPE: peptide 
US-08-862-855-3 

Query Match 100.0%; Score 49; DB 3; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05;

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 CLSSRLDAC 9
              |||||||||
Db          1 CLSSRLDAC 9



RESULT 3 
US-09-226-985-3 

; Sequence 3, Application US/09226985 
; Patent No. 6296832 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 
; APPLICANT: Pasqualini, Renata 

; TITLE OF INVENTION: Molecules That Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES: 44 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 
; STATE: California 

COUNTRY: United States 

ZIP: 92122 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/226,985

FILING DATE: 

CLASSIFICATION: 
; PRIOR APPLICATION DATA: 
APPLICATION NUMBER: US 08/526,710
FILING DATE: 11-SEP-1995

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/813,273 
FILING DATE: 10-MAR-1997 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 
FILING DATE: 23-MAY-1997 

ATTORNEY/AGENT INFORMATION:
NAME: Campbell, Cathryn A.
REGISTRATION NUMBER: 31,815
REFERENCE/DOCKET NUMBER: P-LJ 3423

TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619) 535-9001 
TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 3: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 9 amino acids

; TYPE: amino acid 

TOPOLOGY: circular 

MOLECULE TYPE: peptide 
US-09-226-985-3 



Query Match 100.0%; Score 49; DB 3; Length 9;

Best Local Similarity 100.0%; Pred. No. 2.5e+05; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy          1 CLSSRLDAC 9
              |||||||||
Db          1 CLSSRLDAC 9



RESULT 4 
US-09-227-906-3 

; Sequence 3, Application US/09227906 
; Patent No. 6306365 
; GENERAL INFORMATION: 

APPLICANT: Ruoslahti, Erkki 

APPLICANT: Pasqualini, Renata 

TITLE OF INVENTION: Method of Identifying Molecules That 
TITLE OF INVENTION: Home to a Selected Organ In Vivo 
NUMBER OF SEQUENCES : 44 
CORRESPONDENCE ADDRESS : 

ADDRESSEE: Campbell & Flores LLP 

STREET: 4370 La Jolla Village Drive, Suite 700 

CITY: San Diego 
; STATE: California 

COUNTRY: United States 

ZIP: 92122 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/227,906
; FILING DATE:
CLASSIFICATION:
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/526,710 
FILING DATE: 11-SEP-1995
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/813,273
; FILING DATE: 10-MAR-1997

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/862,855 
FILING DATE: 23-MAY-1997 
ATTORNEY/AGENT INFORMATION: 
NAME: Campbell, Cathryn A. 
; REGISTRATION NUMBER: 31,815 

REFERENCE/DOCKET NUMBER: P-LJ 3424 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (619) 535-9001 

TELEFAX: (619) 535-8949 
; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS : 
LENGTH: 9 amino acids 
TYPE: amino acid 
TOPOLOGY: circular 
MOLECULE TYPE: peptide 
US-09-227-906-3 

Query Match 100.0%; Score 49; DB 4; Length 9; 

Best Local Similarity 100.0%; Pred. No. 2.5e+05;

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 CLSSRLDAC 9
              |||||||||
Db          1 CLSSRLDAC 9



RESULT 5 

US-08-936-165A-491 

Sequence 491, Application US/08936165A 
Patent No. 6348582 
GENERAL INFORMATION: 



APPLICANT: Black, Michael
APPLICANT: Burnham, Martin
APPLICANT: Hodgson, John
APPLICANT: Knowles, David
APPLICANT: Lonetto, Michael
APPLICANT: Nicholas, Richard
APPLICANT: Pratt, Julie
APPLICANT: Reichard, Richard
APPLICANT: Rosenberg, Martin
APPLICANT: Ward, Judith

TITLE OF INVENTION: Novel Prokaryotic Polynucleotides,
TITLE OF INVENTION: Polypeptides and Their Uses 
NUMBER OF SEQUENCES: 534 
CORRESPONDENCE ADDRESS : 

ADDRESSEE: SmithKline Beecham Corporation 

STREET: 709 Swedeland Road

CITY: King of Prussia

STATE: PA
COUNTRY: USA

ZIP: 19406-0939 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ for Windows Version 2.0 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/936,165A

FILING DATE: 24-SEP-1997 

CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 60/027,032 

FILING DATE: 24-SEP-1996 
ATTORNEY/AGENT INFORMATION: 

NAME: Gimmi, Edward R 

REGISTRATION NUMBER: 38,891

REFERENCE/DOCKET NUMBER: P50549
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 610-270-4478 

TELEFAX: 610-270-5090 

TELEX : 

; INFORMATION FOR SEQ ID NO: 491: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 130 amino acids 
TYPE: amino acid 
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: Protein 
US-08-936-165A-491 

Query Match 77.6%; Score 38; DB 4; Length 130;

Best Local Similarity 77.8%; Pred. No. 3.5;

Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CLSSRLDAC 9
              || || |||
Db        112 CLLSRCDAC 120



RESULT 6 

US-09-252-991A-28480 

; Sequence 28480, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 107196.136 

; CURRENT APPLICATION NUMBER: US/09/252,991A

; CURRENT FILING DATE: 1999-02-18 

; PRIOR APPLICATION NUMBER: US 60/074,788 

; PRIOR FILING DATE: 1998-02-18 

; PRIOR APPLICATION NUMBER: US 60/094,190 

; PRIOR FILING DATE: 1998-07-27 

; NUMBER OF SEQ ID NOS: 33142
; SEQ ID NO 28480
LENGTH: 357
TYPE: PRT

ORGANISM: Pseudomonas aeruginosa
US-09-252-991A-28480



Query Match 75.5%; Score 37; DB 4; Length 357; 

Best Local Similarity 87.5%; Pred. No. 16; 

Matches 7; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy          2 LSSRLDAC 9
              :|||||||
Db        298 VSSRLDAC 305



RESULT 7 

US-09-252-991A-26162 

; Sequence 26162, Application US/09252991A 
; Patent No. 6551795
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 107196.136 

; CURRENT APPLICATION NUMBER: US/09/252,991A

; CURRENT FILING DATE: 1999-02-18 

; PRIOR APPLICATION NUMBER: US 60/074,788 

; PRIOR FILING DATE: 1998-02-18 

; PRIOR APPLICATION NUMBER: US 60/094,190 

; PRIOR FILING DATE: 1998-07-27 

; NUMBER OF SEQ ID NOS: 33142

; SEQ ID NO 26162 

LENGTH: 232 

TYPE: PRT 

ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-26162



Query Match 71.4%; Score 35; DB 4; Length 232; 

Best Local Similarity 66.7%; Pred. No. 24; 

Matches 6; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CLSSRLDAC 9
              | ||: |||
Db        128 CRSSKADAC 136



RESULT 8 
US-08-477-451-4 

; Sequence 4, Application US/08477451 
; Patent No. 5928865 
; GENERAL INFORMATION: 

APPLICANT: Covacci, Antonello 

TITLE OF INVENTION: Helicobacter Pylori Cagi Region 
NUMBER OF SEQUENCES: 46
CORRESPONDENCE ADDRESS:

ADDRESSEE: Chiron Corporation
STREET: 4560 Horton Street

CITY: Emeryville

STATE: CA

COUNTRY: USA 

ZIP: 94608-2916 
COMPUTER READABLE FORM : 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/477,451 

FILING DATE: 07-JUN-1995 

CLASSIFICATION : 435 
ATTORNEY/AGENT INFORMATION: 

NAME: McClung, Barbara G. 

REGISTRATION NUMBER: 33,113 

REFERENCE / DOCKET NUMBER: 0335.002 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 510-601-2708 

TELEFAX: 510-655-3542 
; INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 3177 amino acids 

TYPE: amino acid 

STRANDEDNESS : single 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-477-451-4 

Query Match 71.4%; Score 35; DB 2; Length 3177; 

Best Local Similarity 77.8%; Pred. No. 3.7e+02; 

Matches 7; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy          1 CLSSRLDAC 9
              | || ||||
Db       1497 CESSPLDAC 1505



RESULT 9 

US-08-373-134D-2 

; Sequence 2, Application US/08373134D 

; Patent No. 5780296 

; GENERAL INFORMATION:

APPLICANT: Kmiec, Eric 

APPLICANT: Holloman, William 

TITLE OF INVENTION: COMPOSITIONS AND METHODS TO PROMOTE 
TITLE OF INVENTION: HOMOLOGOUS RECOMBINATION IN EUKARYOTIC CELLS AND ORGANISMS

NUMBER OF SEQUENCES: 15 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Pennie & Edmonds 

STREET: 1155 Avenue of the Americas 

CITY: New York 

STATE: New York 

COUNTRY: USA 

ZIP: 10036-2711 